update readme

universome · Aug 6, 2021 · 762786f · 762786f
1 parent 6720a98
commit 762786f
Showing 1 changed file with 23 additions and 7 deletions.
diff --git a/README.md b/README.md
@@ -1,35 +1,51 @@
+
 # HDTF
 Flow-guided One-shot Talking Face Generation with a High-resolution Audio-visual Dataset 
-<a href="https://openaccess.thecvf.com/content/CVPR2021/papers/Zhang_Flow-Guided_One-Shot_Talking_Face_Generation_With_a_High-Resolution_Audio-Visual_Dataset_CVPR_2021_paper.pdf" target="_blank">paper</a> 
+<a href="https://openaccess.thecvf.com/content/CVPR2021/papers/Zhang_Flow-Guided_One-Shot_Talking_Face_Generation_With_a_High-Resolution_Audio-Visual_Dataset_CVPR_2021_paper.pdf" target="_blank">paper</a>    <a href="https://github.com/MRzzm/HDTF/blob/main/Supplementary%20Materials.pdf" target="_blank">supplementary</a>
 
 ## Details of HDTF dataset
-**./HDTF_dataset** consists of *youtube video url*, *time stamps of talking face* and *facial region* in the video.
+**./HDTF_dataset** contains, for each video, the *YouTube video URL*, the *video resolution* (the one used in our method, which may not be the best available), the *time stamps of the talking-face clips*, the *facial region* (as cropped in our method), and the *zoom scale* of the cropped window.
 **xx_video_url.txt:** 
 
 
 ```
 format:     video name | video youtube url
 ```
+**xx_resolution.txt:**
+```
+format:    video name | resolution (in our method)
+```
 
 **xx_annotion_time.txt:**
 ```
 format:    video name | time stamps of clip1 | time stamps of clip2 | time stamps of clip3....
 ```
 **xx_crop_wh.txt:**
 ```
-format:    video name+clip index | min_width | width |  min_height | height
+format:    video name+clip index | min_width | width |  min_height | height (in our method)
+```
+**xx_crop_ratio.txt:**
+```
+format:    video name+clip index | window zoom scale
 ```
+
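The formats above can be read with a few lines of Python. This is a minimal sketch, not the authors' tooling: the helper names are hypothetical, and it assumes that fields are separated by `|` or whitespace, that video names contain no spaces, that each time stamp looks like `00:30-01:00` (as in the *Radio11* example), and that the four crop values are integers.

```python
import re


def split_fields(line):
    """Split one annotation line on '|' or whitespace (assumed separators)."""
    return [field for field in re.split(r"[|\s]+", line.strip()) if field]


def parse_annotation_time(line):
    """Return (video name, [(start, end), ...]) from an xx_annotion_time.txt line."""
    name, *spans = split_fields(line)
    clips = [tuple(span.split("-")) for span in spans]
    return name, clips


def parse_crop_wh(line):
    """Return (clip key, (min_width, width, min_height, height)) from an xx_crop_wh.txt line."""
    key, *values = split_fields(line)
    min_w, w, min_h, h = (int(v) for v in values)
    return key, (min_w, w, min_h, h)
```

For example, `parse_annotation_time("Radio11.mp4 00:30-01:00 01:30-02:30")` yields the name `Radio11.mp4` and two `(start, end)` pairs.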
+
 ## Processing of HDTF dataset
 When using HDTF dataset, 
 
- 1. We provide video and url in  **xx_video_url.txt**. (the highest definition of videos are 1080P or 720P).  Transform video into **.mp4** format and transform interlaced video to progressive video as well.
+ - We provide the video names and URLs in **xx_video_url.txt** (the highest available definition is 1080P or 720P). Convert each video to **.mp4** format, and convert interlaced video to progressive video as well.
+
+ - We split each long original video into talking-head clips using the time stamps in **xx_annotion_time.txt**. Name each clip **video name_clip index.mp4**. For example, split the video *Radio11.mp4 00:30-01:00 01:30-02:30* into *Radio11_0.mp4* and *Radio11_1.mp4*.
+
+ - We did not always download videos at the best available resolution, so we provide two cropping methods. Thanks to @universome and @Feii Yin for pointing out this problem!
 
-2. We split long original video into talking head clips with time stamps in **xx_annotion_time.txt**.  Name the splitted clip as **video name_clip index.mp4**. For example, split the video  *Radio11.mp4 00:30-01:00 01:30-02:30*  into *Radio11_0.mp4* and *Radio11_1.mp4* .
+	1. Download the video at the reference resolution in **xx_resolution.txt** and crop the facial region with the fixed window size in **xx_crop_wh.txt**. (This is the same method we used, but the downloaded video may not be at the best resolution.)
+	2. First, download the video at the best resolution. Then, detect the facial landmarks in the split talking-head clips and compute the square window of the face: compute the facial region in each frame and merge all regions into one square range. Next, enlarge the window size by the scale in **xx_crop_ratio.txt**. Finally, crop the facial region.
 
-3. We crop the facial region with fixed window size in **xx_crop_wh.txt** and resize the video into **512 x 512** resolution.
+- We resize all cropped videos to **512 x 512** resolution.
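
The clip-splitting step can be sketched as one `ffmpeg` command per time-stamp pair. This is only an illustration under stated assumptions: the function name is hypothetical, the codec choices (`libx264`, `aac`) are ours rather than the authors', and `yadif` is used here as one possible deinterlacing filter. The commands are built as argument lists and not executed.

```python
from pathlib import Path


def clip_commands(video_path, spans, deinterlace=False):
    """Build one ffmpeg command per (start, end) span.

    Clips are named <video name>_<clip index>.mp4, as in the README example.
    """
    src = Path(video_path)
    commands = []
    for index, (start, end) in enumerate(spans):
        out = src.with_name(f"{src.stem}_{index}.mp4")
        cmd = ["ffmpeg", "-i", str(src), "-ss", start, "-to", end]
        if deinterlace:
            # Assumption: yadif as the interlaced-to-progressive filter.
            cmd += ["-vf", "yadif"]
        cmd += ["-c:v", "libx264", "-c:a", "aac", str(out)]
        commands.append(cmd)
    return commands
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` once `ffmpeg` is installed; for *Radio11.mp4* with two spans the outputs are *Radio11_0.mp4* and *Radio11_1.mp4*.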
 
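The second cropping method (merge per-frame face regions into one square, then enlarge it) might look like the sketch below. The README does not define the geometry precisely, so everything here is an assumption: boxes are `(x0, y0, x1, y1)` in pixels, the square is centered on the merged region, its side is the longer edge multiplied by the zoom scale, and the window is clamped to the frame. The function names are hypothetical.

```python
def square_crop_window(boxes, zoom, frame_w, frame_h):
    """Merge per-frame face boxes into one square window, enlarged by `zoom`.

    Returns (left, top, side) in pixels, clamped to the frame.
    """
    # Union of all per-frame regions.
    x0 = min(b[0] for b in boxes)
    y0 = min(b[1] for b in boxes)
    x1 = max(b[2] for b in boxes)
    y1 = max(b[3] for b in boxes)
    cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
    # Square side: longer edge times the zoom scale, limited by the frame size.
    side = min(max(x1 - x0, y1 - y0) * zoom, frame_w, frame_h)
    # Keep the window inside the frame.
    left = min(max(cx - side / 2, 0), frame_w - side)
    top = min(max(cy - side / 2, 0), frame_h - side)
    return round(left), round(top), round(side)


def crop_filter(left, top, side, size=512):
    """ffmpeg filter string that crops the window and resizes it to size x size."""
    return f"crop={side}:{side}:{left}:{top},scale={size}:{size}"
```

The filter string can be passed to `ffmpeg` via `-vf` to crop a clip and resize it to 512 x 512 in one pass.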
 
-The HDTF dataset is available to download under a <a href="https://creativecommons.org/licenses/by/4.0/" target="_blank"> Creative Commons Attribution 4.0 International License</a>.
+The HDTF dataset is available to download under a <a href="https://creativecommons.org/licenses/by/4.0/" target="_blank"> Creative Commons Attribution 4.0 International License</a>. If you face any problems when processing HDTF, please contact me.
 
 ## Reference
 If you use HDTF, please cite