How is the audio recording stored? #93
Comments
Hello @gomingchen. Regarding the timestamps, they should be valid for the given recording (start to end). @georges-berenger Any comment regarding a cutoff time? |
I'm not sure how you're reading the audio data, or how you see "the interval between neighboring timestamps is 33333.3 ms" (which is 33.333 seconds, but I assume you actually meant 33.333 ms), or how you get "28672 samples". The audio records should have 147465 bytes total, 32777 bytes of metadata in a datalayout content block, followed by 114688 bytes of audio data. If you're using a |
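The record sizes quoted in this reply can be cross-checked against the 28672-sample figure from the question. The sketch below is only arithmetic on the numbers given in this thread; the 4-bytes-per-sample result is an inference from those numbers, not something confirmed by the maintainers.

```python
# Sizes quoted in the reply above (bytes per audio record).
TOTAL_RECORD_BYTES = 147465
METADATA_BYTES = 32777
AUDIO_BYTES = 114688

# Figures from the question: 7 microphones, 4096 samples per microphone.
NUM_MICS = 7
SAMPLES_PER_MIC = 4096

# Metadata plus audio payload should account for the whole record.
assert METADATA_BYTES + AUDIO_BYTES == TOTAL_RECORD_BYTES

samples_per_record = NUM_MICS * SAMPLES_PER_MIC       # 28672
bytes_per_sample = AUDIO_BYTES / samples_per_record   # 4.0 (inferred, not documented here)

print(samples_per_record, bytes_per_sample)
```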
Hi @georges-berenger and @SeaOtocinclus, thank you for your replies. It's very helpful to learn the details. The "ms" was a typo and should have been "us"; I read the post three times before submitting it but never noticed. This is how I got 28672 (= 4096 × 7) samples. By "sample", I mean one data point in the audio recording.
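For illustration, here is a minimal sketch of splitting one 28672-sample block into 7 per-microphone channels. It assumes the samples are interleaved sample-wise as (mic0, mic1, ..., mic6), which is the assumption made in the original question and is not confirmed in this thread. The function name `deinterleave` is hypothetical.

```python
NUM_MICS = 7
SAMPLES_PER_MIC = 4096


def deinterleave(block, num_channels=NUM_MICS):
    """Split a flat, sample-wise interleaved block into one list per channel.

    Assumes the layout mic0, mic1, ..., mic6, mic0, mic1, ... (an assumption).
    """
    if len(block) % num_channels != 0:
        raise ValueError("block length is not a multiple of the channel count")
    # Every num_channels-th sample, starting at offset ch, belongs to channel ch.
    return [block[ch::num_channels] for ch in range(num_channels)]


# Synthetic block standing in for one audio record (values are just indices).
block = list(range(NUM_MICS * SAMPLES_PER_MIC))
channels = deinterleave(block)
```

If the data were stored channel by channel instead, the slicing would need to be `block[ch * 4096:(ch + 1) * 4096]`; listening to one extracted channel is a quick way to tell which layout applies.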
The dataset is recording_1 under "everyday activity" in the Aria sample dataset. The main function:
This
So I got a matrix with dimension 3549 (number of timestamps) X 28672. Notice how they repeat every two or three blocks? There are only four unique recordings in the first 10 blocks.
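A simple way to collapse such repeats, sketched here in plain Python, is to keep a block only when it differs from the previously kept one. This is a workaround for reading by non-audio timestamps, not the approach recommended by the maintainers, and the helper name is hypothetical.

```python
def drop_consecutive_duplicates(blocks):
    """Return the blocks with runs of identical consecutive blocks collapsed."""
    unique = []
    for block in blocks:
        # Keep the block only if it differs from the last one we kept.
        if not unique or block != unique[-1]:
            unique.append(block)
    return unique


# Toy stand-in for the (number of timestamps) x 28672 matrix: short rows,
# repeating two or three times as described above.
rows = [[1, 2], [1, 2], [3, 4], [3, 4], [3, 4], [5, 6]]
deduplicated = drop_consecutive_duplicates(rows)
```

One caveat: two genuinely distinct records that happen to be identical (for example, all-zero silence) would also be merged, so reading each audio record once by its own timestamp remains the more reliable route.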
I tried to find the pattern. Call this time difference
See line 8: the index of unique blocks changes to 3 when the time difference is 2.73. I have also manually labeled tens of them. I think the cutoff is around 0.65. The following is the MATLAB code I used to decide:
I think after all I cheated on this. There are better ways to read the audio recordings by using their real timestamps. @SeaOtocinclus Could you please give me a few lines as an example of how to use the AriaAudioPlayer class? I found some documentation about the pyvrs.reader module. Is that where I should look? |
To be sure about what you're looking at, let's use the command line tools that let you inspect the data. The vrs open source project includes the "vrs" command line tool, which allows you to peek at the data. Here, we need to inspect audio data only, which means we only need to look at streams with the RecordableTypeId 231 (there should be only one such stream in your files), so we will add the "+ 231" option to the commands. To limit the time range of what we're looking at, let's only look at the first second of data records, with the "--before +1" option. Please make sure to use these exactly as indicated, with the plus sign and spaces. The last command will also validate that the file can be read correctly.
Please provide me with the 3 outputs so that we can talk about the same thing and I know for sure what you're looking at. |
Thank you, @georges-berenger!

My comments

The three fractional zeros after the timestamps are added by MATLAB when it reads the csv file. The timestamps are from a file called "timestamp_map.csv" in the "synchronization" folder under the "recording_1" folder. If you download the Aria sample dataset, there are two folders under "everyday activity": one is "recording_1" and the other is "recording_2". Exactly: 85.33 ms apart between every 4096 samples for one mic, or 28672 samples collected by all 7 mics. That's what I was asking: since 85.33 ms is longer than 33.33 ms (the time interval between neighboring vrs timestamps), what are the rules for retrieving audio recordings based on vrs timestamps? Or are there audio timestamps that are 85.33 ms apart? How can I read audio recordings without trying to find the repetitions? Thank you for your time!

Output of the three commands

Output of
Output of vrs recording.vrs + 231 --before +1
Output of
Those are timestamps of each audio sample in the first second(?) |
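The timing figures in this comment can be checked with a few lines of arithmetic. The 48 kHz sample rate below is not stated in the thread; it is inferred from "4096 samples = 85.33 ms" and should be treated as an assumption.

```python
SAMPLES_PER_MIC = 4096
SAMPLE_RATE_HZ = 48000          # assumption: inferred from 4096 samples = 85.33 ms
TIMESTAMP_INTERVAL_MS = 33.333  # spacing of the timestamps in timestamp_map.csv

# Duration covered by one audio record, in milliseconds.
block_duration_ms = SAMPLES_PER_MIC / SAMPLE_RATE_HZ * 1000.0

# How many of the 33.33 ms timestamps fall inside one audio record.
timestamps_per_block = block_duration_ms / TIMESTAMP_INTERVAL_MS

print(round(block_duration_ms, 2), round(timestamps_per_block, 2))
```

A ratio between 2 and 3 is consistent with the observation that the same audio block comes back for either two or three consecutive timestamps.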
I still don't see where the 33.33 ms comes from. I got a command wrong: I meant to ask for: |
Thanks, @georges-berenger. Could you give an example of a few lines of code about how to retrieve audio records in a VRS file? |
First, since we're talking about Aria files, you probably want to use the Aria data tools, which has a specialized |
I believe we can call this issue resolved. |
Hi folks,
Amazing work here, but I have trouble reading the microphone recordings. I can get the audio data by using readDataRecordByTime(mic_stream_id, timestamp), which gives me a matrix with dimensions (number of timestamps) X 28672. I know there are 7 microphones, and I assume the recordings from the 7 channels are stored sample-wise as (mic0, mic1, ..., mic6), but I also observe that the class is called stereoaudiorecordable, and there seem to be repetitions of the same sound block, 4096 samples apart. After taking every other block of 4096 samples, the one-mic recording still sounds weird. I used the everyday activity file in the sample dataset. How can I correctly extract data from each microphone? Thank you very much!

Update: I got most of it, but have one question: is the cutoff time 0.65?
I used the timestamps provided in the CSV file "timestamp_map", which I think are not for the microphones. 4096 samples equal 85333 ms, while the interval between neighboring timestamps is 33333.3 ms, which is shorter than 85333. That's why I am seeing repeating blocks of 28672 samples every two or three blocks. I plotted many and found a pattern: the Nth recording is retrieved if the fractional part of the time difference
(timestamp[i] - timestamp[0])*1e-3/(4096/fs)/1e6
is less than 0.65 (the integer part of the time difference is N); otherwise it is the (N+1)th. This assigns an index to each block read from the timestamps, so I can keep only the unique ones. I got a decent recording that sounds like normal speech. Just to double-check: is the cutoff fraction 0.65 correct, or is it some other number?

Thank you. I would much appreciate any response from your team!
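The indexing rule described above can be written out as a short sketch. It reproduces the formula from the post, which implies timestamps in nanoseconds (my reading of the 1e-3 and 1e6 factors). The 0.65 cutoff is the poster's empirical value and is not confirmed anywhere in this thread; the function name and the 48 kHz default are assumptions.

```python
import math


def block_index(timestamp, first_timestamp, fs=48000,
                samples_per_block=4096, cutoff=0.65):
    """Guess which audio block a timestamp maps to, using the empirical cutoff.

    Timestamps are assumed to be in nanoseconds; fs and cutoff are assumptions.
    """
    # Same expression as in the post: elapsed time measured in block durations.
    ratio = (timestamp - first_timestamp) * 1e-3 / (samples_per_block / fs) / 1e6
    whole = math.floor(ratio)
    fraction = ratio - whole
    # Below the cutoff the timestamp is assigned to block N, otherwise to N + 1.
    return whole if fraction < cutoff else whole + 1


# A time difference of 2.73 block durations (232,960,000 ns at 48 kHz).
example_index = block_index(232_960_000, 0)
```

Keeping only the first timestamp seen for each index then yields one copy of each block. Since the rule is fitted by eye, it is safer to treat it as a stopgap rather than a description of how records are actually looked up.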