Introduction
This vignette demonstrates how to use ASAP functions for analyzing single WAV files of zebra finch vocalizations. While ASAP is designed for large-scale longitudinal studies, all core analysis functions work directly with individual audio files, making it easy to explore and understand your data.
We’ll cover:
- Audio Visualization - View spectrograms
- Bout Detection - Find singing periods
- Syllable Segmentation - Detect individual syllables
- Spectral Entropy - Measure spectral structure
- Fundamental Frequency - Extract pitch contours
- Amplitude Envelope - Analyze temporal dynamics
Setup
library(ASAP)
# Get path to example WAV file included with the package
wav_file <- system.file("extdata", "zf_example.wav", package = "ASAP")1. Audio Visualization
The visualize_song() function creates spectrogram
visualizations of audio recordings. This is typically the first step in
exploring your data.
Full recording
# Visualize the entire recording
visualize_song(wav_file)
#> Song visualization completed for: zf_example.wav
Specific time window
You can focus on a specific time range using
start_time_in_second and
end_time_in_second:
# Visualize a 3-second segment
visualize_song(wav_file,
start_time_in_second = 1,
end_time_in_second = 4)
#> Song visualization completed for: zf_example.wav
2. Bout Detection
A “bout” is a continuous period of singing. The
find_bout() function automatically detects bout boundaries
using RMS (root mean square) amplitude thresholding with bandpass
filtering.
# Detect bouts in the recording
bouts <- find_bout(wav_file,
rms_threshold = 0.1, # Amplitude threshold (0-1)
min_duration = 0.7, # Minimum bout length in seconds
plot = TRUE) # Show detection plot
# View detected bouts as a table
knitr::kable(bouts, digits = 3)| filename | selec | start_time | end_time |
|---|---|---|---|
| zf_example.wav | 1 | 1.057 | 3.553 |
| zf_example.wav | 2 | 4.156 | 4.946 |
3. Syllable Segmentation
The segment() function detects individual syllables
within a specified time window using dynamic spectral thresholding.
# Segment syllables in a time window
syllables <- segment(wav_file,
start_time = 1, # Start time (seconds)
end_time = 5, # End time (seconds)
flim = c(1, 8), # Frequency limits (kHz)
silence_threshold = 0.01,
min_syllable_ms = 20, # Minimum syllable duration
max_syllable_ms = 240, # Maximum syllable duration
min_level_db = 10, # Starting threshold (dB)
verbose = FALSE)
# View detected syllables as a table
knitr::kable(syllables, digits = 2)| filename | selec | threshold | .start | .end | start_time | end_time | duration | silence_gap |
|---|---|---|---|---|---|---|---|---|
| zf_example.wav | 1 | 10 | 0.06 | 0.12 | 1.06 | 1.12 | 0.06 | NA |
| zf_example.wav | 2 | 10 | 0.15 | 0.20 | 1.15 | 1.20 | 0.05 | 0.03 |
| zf_example.wav | 3 | 10 | 0.26 | 0.31 | 1.26 | 1.31 | 0.05 | 0.06 |
| zf_example.wav | 4 | 10 | 0.36 | 0.41 | 1.36 | 1.41 | 0.05 | 0.05 |
| zf_example.wav | 5 | 10 | 0.44 | 0.52 | 1.44 | 1.52 | 0.08 | 0.03 |
| zf_example.wav | 6 | 10 | 0.55 | 0.61 | 1.55 | 1.61 | 0.06 | 0.03 |
| zf_example.wav | 7 | 10 | 0.62 | 0.84 | 1.62 | 1.84 | 0.21 | 0.01 |
| zf_example.wav | 8 | 10 | 0.87 | 0.98 | 1.87 | 1.98 | 0.11 | 0.04 |
| zf_example.wav | 9 | 10 | 0.99 | 1.05 | 1.99 | 2.05 | 0.06 | 0.01 |
| zf_example.wav | 10 | 10 | 1.09 | 1.22 | 2.09 | 2.22 | 0.12 | 0.04 |
| zf_example.wav | 11 | 10 | 1.33 | 1.54 | 2.33 | 2.54 | 0.20 | 0.12 |
| zf_example.wav | 12 | 10 | 1.67 | 1.74 | 2.67 | 2.74 | 0.07 | 0.13 |
| zf_example.wav | 13 | 10 | 1.75 | 1.84 | 2.75 | 2.84 | 0.09 | 0.02 |
| zf_example.wav | 14 | 10 | 1.86 | 1.92 | 2.86 | 2.92 | 0.06 | 0.02 |
| zf_example.wav | 15 | 10 | 1.93 | 2.16 | 2.93 | 3.16 | 0.22 | 0.01 |
| zf_example.wav | 16 | 10 | 2.18 | 2.29 | 3.18 | 3.29 | 0.11 | 0.03 |
| zf_example.wav | 17 | 10 | 2.31 | 2.36 | 3.31 | 3.36 | 0.05 | 0.01 |
| zf_example.wav | 18 | 10 | 2.40 | 2.54 | 3.40 | 3.54 | 0.14 | 0.04 |
| zf_example.wav | 19 | 10 | 3.18 | 3.24 | 4.18 | 4.24 | 0.06 | 0.63 |
| zf_example.wav | 20 | 10 | 3.27 | 3.34 | 4.27 | 4.34 | 0.07 | 0.03 |
| zf_example.wav | 21 | 10 | 3.35 | 3.44 | 4.35 | 4.44 | 0.09 | 0.01 |
| zf_example.wav | 22 | 10 | 3.46 | 3.51 | 4.46 | 4.51 | 0.06 | 0.02 |
| zf_example.wav | 23 | 10 | 3.53 | 3.75 | 4.53 | 4.75 | 0.23 | 0.01 |
| zf_example.wav | 24 | 10 | 3.78 | 3.88 | 4.78 | 4.88 | 0.10 | 0.02 |
| zf_example.wav | 25 | 10 | 3.90 | 3.95 | 4.90 | 4.95 | 0.05 | 0.02 |
4. Spectral Entropy Analysis
Spectral entropy measures the “randomness” or structure in the frequency distribution. Harmonic sounds (like syllables) have low entropy, while noisy sounds have high entropy.
# Calculate spectral entropy for a segment
entropy_result <- spectral_entropy(wav_file,
start_time = 1.5,
end_time = 2.5,
method = "wiener", # or "shannon"
normalize = TRUE,
plot = TRUE)
5. Fundamental Frequency (Pitch) Analysis
The FF() function extracts the fundamental frequency
(F0) contour, which represents the perceived pitch of the vocalization
over time.
# Extract fundamental frequency
pitch_result <- FF(wav_file,
start_time = 1.5,
end_time = 2.5,
method = "cepstrum", # Cepstral analysis
fmax = 1400, # Maximum F0 to detect (Hz)
threshold = 10, # Confidence threshold
plot = TRUE)
6. Amplitude Envelope
The amplitude envelope represents the temporal dynamics of sound
intensity. Use amp_env() on segmented data to extract
envelope profiles.
# Extract amplitude envelope for the second bout
if (!is.null(bouts) && nrow(bouts) >= 2) {
env_bout <- amp_env(bouts[2, ],
wav_dir = dirname(wav_file),
msmooth = c(256, 50),
norm = TRUE,
plot = TRUE)
}
We can also extract the envelope for individual syllables:
# Extract amplitude envelope for the 7th syllable
if (!is.null(syllables) && nrow(syllables) >= 7) {
env_syl <- amp_env(syllables[7, ],
wav_dir = dirname(wav_file),
msmooth = c(256, 50),
norm = TRUE,
plot = TRUE)
}
Putting It All Together
Here’s a typical workflow for single-file analysis:
library(ASAP)
# 1. Load and visualize
wav_file <- "path/to/your/recording.wav"
visualize_song(wav_file, start_time_in_second = 0, end_time_in_second = 10)
# 2. Detect bouts
bouts <- find_bout(wav_file, rms_threshold = 0.1, min_duration = 0.7)
# 3. Segment syllables
if (!is.null(bouts) && nrow(bouts) > 0) {
syllables <- segment(wav_file,
start_time = bouts$start_time[1],
end_time = bouts$end_time[1],
min_level_db = 10)
}
# 4. Analyze spectral features
entropy <- spectral_entropy(wav_file,
start_time = bouts$start_time[1],
end_time = bouts$end_time[1])
pitch <- FF(wav_file,
start_time = bouts$start_time[1],
end_time = bouts$end_time[1])
# 5. Extract amplitude envelope
env <- amp_env(bouts[1, ],
wav_dir = dirname(wav_file),
msmooth = c(256, 50),
norm = TRUE,
plot = TRUE)Session Info
sessionInfo()
#> R version 4.5.2 (2025-10-31)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.3 LTS
#>
#> Matrix products: default
#> BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
#>
#> locale:
#> [1] LC_CTYPE=C.UTF-8 LC_NUMERIC=C LC_TIME=C.UTF-8
#> [4] LC_COLLATE=C.UTF-8 LC_MONETARY=C.UTF-8 LC_MESSAGES=C.UTF-8
#> [7] LC_PAPER=C.UTF-8 LC_NAME=C LC_ADDRESS=C
#> [10] LC_TELEPHONE=C LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C
#>
#> time zone: UTC
#> tzcode source: system (glibc)
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] ASAP_0.3.4
#>
#> loaded via a namespace (and not attached):
#> [1] sass_0.4.10 generics_0.1.4 tidyr_1.3.2 lattice_0.22-7
#> [5] digest_0.6.39 magrittr_2.0.4 evaluate_1.0.5 grid_4.5.2
#> [9] RColorBrewer_1.1-3 fastmap_1.2.0 jsonlite_2.0.0 Matrix_1.7-4
#> [13] tuneR_1.4.7 purrr_1.2.1 scales_1.4.0 pbapply_1.7-4
#> [17] textshaping_1.0.5 jquerylib_0.1.4 cli_3.6.5 rlang_1.1.7
#> [21] pbmcapply_1.5.1 fftw_1.0-9 seewave_2.2.4 cachem_1.1.0
#> [25] yaml_2.3.12 av_0.9.6 tools_4.5.2 parallel_4.5.2
#> [29] dplyr_1.2.0 ggplot2_4.0.2 reticulate_1.45.0 vctrs_0.7.1
#> [33] R6_2.6.1 png_0.1-8 lifecycle_1.0.5 fs_1.6.7
#> [37] MASS_7.3-65 ragg_1.5.1 pkgconfig_2.0.3 desc_1.4.3
#> [41] pkgdown_2.2.0 pillar_1.11.1 bslib_0.10.0 gtable_0.3.6
#> [45] glue_1.8.0 Rcpp_1.1.1 systemfonts_1.3.2 xfun_0.56
#> [49] tibble_3.3.1 tidyselect_1.2.1 knitr_1.51 farver_2.1.2
#> [53] htmltools_0.5.9 patchwork_1.3.2 rmarkdown_2.30 signal_1.8-1
#> [57] compiler_4.5.2 S7_0.2.1