A joint collaboration between:
Image Processing Group at Universitat Politecnica de Catalunya (UPC) | Computer Vision Group at Universitat de Barcelona (UB)
How do computers learn? Can they see, listen, or reason the way humans do? The Artificial Intelligence Reading Group (ReadAI) addresses these problems.
The reading group will meet at UPC and UB to analyze a recent scientific publication. Each week, a different member of the group will prepare a set of slides to be shared and discussed with the attendees, who will have read the paper beforehand. UPC-MET students will receive the corresponding ECTS according to their activity before and during each session. If you are interested in attending, send an e-mail to professor Xavier Giró from the UPC Image Processing Group.
Other reading groups with public listings: University of Texas, University of Toronto, University of British Columbia, Stanford University, The Berkeley View.
Usually on Mondays from 17:15 to 18:00 @ UPC Campus Nord in Building D5, Room 003
- ??/??/2017 (Xavi Giro) Yang, Jianwei, Anitha Kannan, Dhruv Batra, and Devi Parikh. "LR-GAN: Layered recursive generative adversarial networks for image generation." ICLR 2017
- ??/??/?? (Xavi Giró) Karessli, Nour, Zeynep Akata, Andreas Bulling, and Bernt Schiele. "Gaze Embeddings for Zero-Shot Image Classification." CVPR 2017
- ??/??/2017 (Amaia Salvador) Lu, Jiasen, Caiming Xiong, Devi Parikh, and Richard Socher. "Knowing When to Look: Adaptive Attention via A Visual Sentinel for Image Captioning." CVPR 2017
- 15/05/2017 (Santiago Pascual) Pascual, Santiago, Antonio Bonafonte, and Joan Serrà. "SEGAN: Speech Enhancement Generative Adversarial Network." arXiv preprint arXiv:1703.09452 (2017).
- 08/05/2017 (Alejandro Woodward) Yao-Hung Hubert Tsai and Liang-Kang Huang and Ruslan Salakhutdinov, "Learning Robust Visual-Semantic Embeddings" arXiv:1703.05908, 2017.
- 24/04/2017 (Alberto Bozal) Spampinato, Concetto, Simone Palazzo, Isaak Kavasidis, Daniela Giordano, Mubarak Shah, and Nasim Souly. "Deep Learning Human Mind for Automated Visual Classification." CVPR 2017.
- 20/03/2017 (Slides by Fran Roldan): Zhang, Hanwang, Zawlin Kyaw, Shih-Fu Chang, and Tat-Seng Chua. "Visual Translation Embedding Network for Visual Relation Detection." CVPR 2017. [demo]
- 13/03/2017 (Slides by Dídac Surís) Abu-El-Haija, Sami, Nisarg Kothari, Joonseok Lee, Paul Natsev, George Toderici, Balakrishnan Varadarajan, and Sudheendra Vijayanarasimhan. "Youtube-8m: A large-scale video classification benchmark." arXiv preprint arXiv:1609.08675 (2016). [project]
- 06/03/2017 (Slides by Alberto Bozal): Bashivan, Pouya, Irina Rish, Mohammed Yeasin, and Noel Codella. "Learning representations from EEG with deep recurrent-convolutional neural networks." ICLR 2016. [code]
- 20/02/2017 (Slides by Xunyu Lin): Misra, Ishan, C. Lawrence Zitnick, and Martial Hebert. "Shuffle and learn: unsupervised learning using temporal order verification." ECCV 2016. [code]
Usually on Fridays from 12:15 to 13:00 @ UPC Campus Nord in Building D5, Room 003
- 10/02/2017 (Fran Roldan): Fukui, Akira, Dong Huk Park, Daylen Yang, Anna Rohrbach, Trevor Darrell, and Marcus Rohrbach. "Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding." arXiv preprint arXiv:1606.01847 (2016). [code] [demo]
- 03/02/17 (Slides by Junting Pan) Marcella Cornia, Lorenzo Baraldi, Giuseppe Serra, Rita Cucchiara. Predicting Human Eye Fixations via an LSTM-based Saliency Attentive Model. arXiv 2016.
- 27/01/17 (Slides by Víctor Garcia) Nguyen A, Yosinski J, Bengio Y, Dosovitskiy A, Clune J (2016). Plug & Play Generative Networks: Conditional Iterative Generation of Images in Latent Space. arXiv 1612.00005.
- 16/12/16 (Invited: Adriana Romero) Jégou, Simon, Michal Drozdzal, David Vazquez, Adriana Romero, and Yoshua Bengio. "The One Hundred Layers Tiramisu: Fully Convolutional DenseNets for Semantic Segmentation." arXiv preprint arXiv:1611.09326 (2016). [code] [slides]
- 16/12/16 (Invited: Lluis Castrejon) Castrejón, Lluıs, Yusuf Aytar, Carl Vondrick, Hamed Pirsiavash, and Antonio Torralba. "Learning Aligned Cross-Modal Representations from Weakly Aligned Data." CVPR 2016 [Slides]
- 25/11/16 (Slides by Víctor Garcia) Isola, Phillip and Zhu, Jun-Yan and Zhou, Tinghui and Efros, Alexei A, "Image-to-Image Translation with Conditional Adversarial Nets". arXiv 2016.
- 18/11/2016 (Slides by Míriam Bellver) Miriam Bellver, Xavier Giro-i-Nieto, Ferran Marques, and Jordi Torres. "Hierarchical Object Detection with Deep Reinforcement Learning." In Deep Reinforcement Learning Workshop (NIPS). 2016.
- 28/10/16 (Slides by Míriam Bellver) Liu, Wei, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, and Scott Reed. "SSD: Single Shot MultiBox Detector." ECCV 2016. [code]
- 14/10/16 (Slides by Andrea Calafell and Eva Mohedano) Zhong, Yujie, Relja Arandjelović, and Andrew Zisserman. "Faces In Places: compound query retrieval." In BMVC 2016.
- 07/10/16 (Slidecast and Slides by Junting Pan) Zoya Bylinskii, Adrià Recasens, Ali Borji, Aude Oliva, Fredo Durand and Antonio Torralba "Where should saliency models look next?" ECCV 2016
- 30/09/16 (Slides by Víctor Garcia) Reed, Scott, Zeynep Akata, Xinchen Yan, Lajanugen Logeswaran, Bernt Schiele, and Honglak Lee. "Generative adversarial text to image synthesis." ICML 2016. [code]
- 23/09/16 (Slides by Albert Jiménez) Yang, Jianwei, Devi Parikh, and Dhruv Batra. "Joint Unsupervised Learning of Deep Representations and Image Clusters." CVPR 2016. [Torch]
- 09/09/2016 (Slides and Slidecast by Manel Baradad) Romera-Paredes, Bernardino, and Philip HS Torr. "Recurrent instance segmentation." ECCV 2016.
@ UB Plaça Universitat, Maths school, 2nd floor, Room T2
- 15/12/2016 (Eduardo Aguilar) Hui Wu, Michele Merler, Rosario Uceda-Sosa, John R. Smith. Learning to Make Better Mistakes: Semantics-aware Visual Food Recognition. ACM Multimedia Conference, 2016.
- 01/12/2016 (Mostafa Kamal Sarker) TBD
- 25/11/2016 (Maria Leyva) Limin Wang, Sheng Guo, Weilin Huang, Yuanjun Xiong, Yu Qiao. Knowledge Guided Disambiguation for Large-Scale Scene Classification with Multi-Resolution CNNs. Places365
- 10/11/2016 (Slides by Javier Castillos Quiroz) Ethan Fast, William McGrath, Pranav Rajpurkar, Michael Bernstein. Augur: Mining Human Behaviors from Fiction to Power Interactive Systems. CHI: ACM Conference on Human Factors in Computing Systems, 2016. Honorable Mention.
- 03/11/2016 (Slides by Alejandro Cartas) Minghuang Ma, Haoqi Fan and Kris M. Kitani. Going Deeper into First-Person Activity Recognition. CVPR 2016.
- 27/10/2016 (Slides by Pedro Herruzo) Vinay Bettadapura, Edison Thomaz, Aman Parnami, Gregory D. Abowd, Irfan Essa. Leveraging Context to Support Automated Food Recognition in Restaurants. WACV 2015.
- 20/10/2016 (Slides from the authors adapted by Marc Bolaños) Fukui, A., Park, D. H., Yang, D., Rohrbach, A., Darrell, T., & Rohrbach, M. Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding. Winner of the Visual Question Answering Challenge at CVPR 2016.
- 13/10/2016 (Slides by Estefania Tavalera) Quanzeng You, Jiebo Luo, Hailin Jin, Jianchao Yang. Building a Large Scale Dataset for Image Emotion Recognition: The Fine Print and The Benchmark. arXiv, 2016.
- 06/10/2016 (Slides by Maya Aghaei) Jing Wang, Yu Cheng, Rogerio Schmidt Feris. Walk and Learn: Facial Attribute Representation Learning from Egocentric Video and Contextual Data. CVPR 2016.
- 29/09/2016 (Presented by Gabriel de Oliveira) Albert Gordo, Jon Almazan, Jerome Revaud, Diane Larlus. Deep Image Retrieval: Learning global representations for image search. ECCV 2016.
@ UB Plaça Universitat, Maths school, 2nd floor, Room T1
- 20/07/2016 (Slides by Marc Bolaños) Marc Bolaños, Álvaro Peris, Francisco Casacuberta and Petia Radeva. Deep Neural Networks for Multimodal Learning. CVPR 2016 challenge on VQA.
- 13/07/2016 (Slides by Alejandro Cartas) Georgia Gkioxari, Ross Girshick, Jitendra Malik. Contextual Action Recognition with R*CNN. ICCV 2015.
- 16/06/2016 (Slides by Maya Aghaei) Maya Aghaei, Mariella Dimiccoli, Petia Radeva. With Whom Do I Interact? Detecting Social Interactions in Egocentric Photo-streams. ICPR 2016.
- 09/06/2016 (Slides by Gabriel de Oliveira) Gabriel de Oliveira, Alejandro Cartas, Marc Bolaños, Mariella Dimiccoli, Maya Aghaei, Xavi Giró-i-Nieto, Petia Radeva. LEMoRe: A Lifelog Engine for Moments Retrieval at the NTCIR-Lifelog LSAT Task. 12th NTCIR Conference, Evaluation of Information Access Technologies, Tokyo, Japan, 2016.
- 18/05/2016 (Slides by Marc Bolaños) Pingbo Pan, Zhongwen Xu, Yi Yang, Fei Wu, Yueting Zhuang. Hierarchical Recurrent Neural Encoder for Video Representation with Application to Captioning. CVPR 2016.
- 11/05/2016 (Slides by Maya Aghaei) Hyun Soo Park and Jianbo Shi. Social Saliency Prediction. CVPR 2015.
- 13/04/2016 (Slides by Estefanía Tavalera) Tao Chen, Damian Borth, Trevor Darrell and Shih-Fu Chang. DeepSentiBank: Visual sentiment concept classification with deep convolutional neural networks. arXiv preprint arXiv:1410.8586 (2014).
- 06/04/2016 (Slides by Alejandro Cartas) Eric Tzeng, Judy Hoffman, Trevor Darrell, Kate Saenko. Simultaneous Deep Transfer Across Domains and Tasks. ICCV 2015.
- 29/03/2016 (Slides by Gabriel de Oliveira) Mattis Paulin, Julien Mairal, Matthijs Douze, Zaid Harchaoui, Florent Perronnin, Cordelia Schmid. Convolutional Patch Representations for Image Retrieval: an Unsupervised Approach. arXiv 2016. [Project page]
- 15/03/2016 (Slides by Marc Bolaños): Bolei Zhou, Aditya Khosla, Agata Lapedriza, Aude Oliva, Antonio Torralba. Learning Deep Features for Discriminative Localization. CVPR 2016.
- 08/03/2016 (Slides by Cosmin Bercea): Jeffrey Donahue, Lisa Anne Hendricks, Sergio Guadarrama, Marcus Rohrbach, Subhashini Venugopalan, Kate Saenko, Trevor Darrell. Long-term Recurrent Convolutional Networks for Visual Recognition and Description. CVPR 2015. [code]
- 01/03/2016 (Slides by Marc Carné): Xiong, Bo, and Kristen Grauman. "Detecting snap points in egocentric video with a web photo prior." In Computer Vision–ECCV 2014, pp. 282-298. Springer International Publishing, 2014.
- 23/02/2016 (Slides by Maya Aghaei): Yue-Hei Ng, Joe, Matthew Hausknecht, Sudheendra Vijayanarasimhan, Oriol Vinyals, Rajat Monga, and George Toderici. "Beyond short snippets: Deep networks for video classification." In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4694-4702. 2015. [video] [Google Research post]
- 19/07/2016 (Xavier Giro-i-Nieto): Visin, Francesco, Kyle Kastner, Aaron Courville, Yoshua Bengio, Matteo Matteucci, and Kyunghyun Cho. "ReSeg: A Recurrent Neural Network for Object Segmentation." arXiv preprint arXiv:1511.07053 (2015). [code]
- 21/06/2016 (Slidecast and Slides by Miriam Bellver): Lu, Yongxi, Tara Javidi, and Svetlana Lazebnik. "Adaptive Object Detection Using Adjacency and Zoom Prediction." CVPR 2016. [code]
- 14/06/2016 (Slidecast and Slides by Alberto Montes) Shou, Zheng, Dongang Wang, and Shih-Fu Chang. "Temporal Action Localization in Untrimmed Videos via Multi-stage CNNs." CVPR 2016 [code]
- 31/05/2016 (Slides and Slidecast by Santi Pascual): Xiong, Caiming, Stephen Merity, and Richard Socher. "Dynamic Memory Networks for Visual and Textual Question Answering." arXiv preprint arXiv:1603.01417 (2016). [discussion]
- 24/05/2016 (Elisa Sayrol): Srinivas S S Kruthiventi, Vennela Gudisa, Jaley H Dholakiya and R. Venkatesh Babu, "Saliency Unified: A Deep Architecture for simultaneous Eye Fixation Prediction and Salient Object Segmentation". In Proceedings of the IEEE International Conference on Computer Vision, 2016.
- 17/05/2016 (Slides by Andrea Ferri): Kai Kang, Hongsheng Li, Junjie Yan, Xingyu Zeng, Bin Yang, Tong Xiao, Cong Zhang, Zhe Wang, Ruohui Wang, Xiaogang Wang, and Wanli Ouyang, T-CNN: Tubelets with Convolutional Neural Networks for Object Detection from Videos, CVPR 2016 [code]
- 10/05/2016 (Slides and Screencast by Issey Masuda): Zhu, Yuke, Oliver Groth, Michael Bernstein, and Li Fei-Fei. "Visual7W: Grounded Question Answering in Images." CVPR 2016.
- 03/05/2016 (Slides and Video): Mohedano E, Salvador A, McGuinness K, Giró-i-Nieto X, O'Connor N, Marqués F. Bags of Local Convolutional Features for Scalable Instance Search. In: ACM International Conference on Multimedia Retrieval (ICMR). New York City, NY; USA: 2016.
- 03/05/2016 (Slides and Video): Salvador A, Giró-i-Nieto X, Marqués F, Satoh S'ichi. Faster R-CNN Features for Instance Search. In: CVPR Workshop Deep Vision. Las Vegas, NV. USA. 2016
- 03/05/2016 (Slides and Screencast by Albert Jiménez): Gordo, Albert, Jon Almazan, Jerome Revaud, and Diane Larlus. "Deep Image Retrieval: Learning global representations for image search." ECCV 2016.
- 26/04/2016 (Slides by Dèlia Fernàndez): Parikh, Devi, and Kristen Grauman. "Relative attributes." In Computer Vision (ICCV), 2011 IEEE International Conference on, pp. 503-510. IEEE, 2011.
- 12/04/2016 (Screencast and Slides by Alberto Montes): Yao, Li, Atousa Torabi, Kyunghyun Cho, Nicolas Ballas, Christopher Pal, Hugo Larochelle, and Aaron Courville. "Describing videos by exploiting temporal structure." In Proceedings of the IEEE International Conference on Computer Vision, pp. 4507-4515. 2015. [code]
- 29/03/2016 (Video and Slides by Victor Campos) Jaderberg, Max, Karen Simonyan, and Andrew Zisserman. Spatial transformer networks. Advances in Neural Information Processing Systems 2015.
- 15/03/2016 (Slides by Andrea Ferri) Redmon, Joseph, Santosh Divvala, Ross Girshick, and Ali Farhadi. "You only look once: Unified, real-time object detection." arXiv preprint arXiv:1506.02640 (2015). [Project page]
- 01/03/2016 (Slides by Amaia Salvador): Ren, S., He, K., Girshick, R. and Sun, J., 2015. Faster R-CNN: Towards real-time object detection with region proposal networks. In Advances in Neural Information Processing Systems (pp. 91-99). [Python code] [Matlab code]
- 16/02/2016 (Slides by Míriam Bellver): Caicedo, Juan C., and Svetlana Lazebnik. "Active object localization with deep reinforcement learning." In Proceedings of the IEEE International Conference on Computer Vision, pp. 2488-2496. 2015 [related Reddit] [slides from other reading group]
- 15-23/12/2015 (Slides by Marc Carné): Aditya Khosla, Akhil S. Raju, Antonio Torralba and Aude Oliva, "Understanding and Predicting Image Memorability at a Large Scale". International Conference on Computer Vision (ICCV), 2015.
- 01/12/2015 (Slides by Víctor Campos): Takuya Narihira, Damian Borth, Stella X. Yu, Karl Ni, Trevor Darrell, "Mapping Images to Sentiment Adjective Noun Pairs with Factorized Neural Nets".
- 24/11/2015 (Slides by Alejandro Cartas): Sun, Chen, Manohar Paluri, Ronan Collobert, Ram Nevatia, and Lubomir Bourdev. "ProNet: Learning to Propose Object-specific Boxes for Cascaded Neural Networks." arXiv preprint arXiv:1511.03776 (2015).
- 17/11/2015 (Slides by Xavier Giró-i-Nieto): Schroff, Florian, Dmitry Kalenichenko, and James Philbin. "FaceNet: A Unified Embedding for Face Recognition and Clustering." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015. Implementation from OpenFace.
- 10/11/2015 (Slides by Elisa Sayrol): Szegedy, Christian, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, and Andrew Rabinovich. "Going Deeper With Convolutions." In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1-9. 2015.
- 03/11/2015 (Slides by Marc Bolaños): Karpathy, Andrej, and Li Fei-Fei. "Deep visual-semantic alignments for generating image descriptions." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015.
- 27/10/2015 (Slides by Xavier Giró-i-Nieto): Kruthiventi, Srinivas SS, Kumar Ayush, and R. Venkatesh Babu. "DeepFix: A Fully Convolutional Neural Network for predicting Human Eye Fixations." arXiv preprint arXiv:1510.02927 (2015)
- 20/10/2015 (Slides by Albert Jiménez): Castro, D., Hickson, S., Bettadapura, V., Thomaz, E., Abowd, G., Christensen, H., & Essa, I. (2015, September). Predicting daily activities from egocentric images using deep learning. In Proceedings of the 2015 ACM International Symposium on Wearable Computers (pp. 75-82). ACM
- 13/10/2015 (Slides by Miriam Bellver): Hong, S., Noh, H., & Han, B. (2015). Decoupled Deep Neural Network for Semi-supervised Semantic Segmentation. arXiv preprint arXiv:1506.04924.
- 6/10/2015: CANCELLED
- 29/9/2015: CANCELLED
- 22/9/2015 (Slides by Amaia Salvador): Russakovsky, Olga, Li-Jia Li, and Li Fei-Fei. "Best of both worlds: human-machine collaboration for object annotation." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015
- 15/9/2015 (Slides by Victor Campos): Dosovitskiy, Alexey, Jost Tobias Springenberg, and Thomas Brox. "Learning to Generate Chairs With Convolutional Neural Networks." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015. [video]
- 02/07/2015 (Slides by Xavier Giró-i-Nieto) Farabet, Clement, Camille Couprie, Laurent Najman, and Yann LeCun. "Learning hierarchical features for scene labeling." Pattern Analysis and Machine Intelligence, IEEE Transactions on 35, no. 8 (2013): 1915-1929
- 18/06/2015 (Slides by Xavier Giró-i-Nieto) Russakovsky, O., Bearman, A. L., Ferrari, V., & Li, F. F. (2015). What's the point: Semantic segmentation with point supervision. arXiv preprint arXiv:1506.02106
- 08/06/2015 (Slides by Amaia Salvador) Girshick, R. (2015). Fast R-CNN. arXiv preprint arXiv:1504.08083.
- 06/06/2015 (Slides by Amaia Salvador) Zhou, B., Khosla, A., Lapedriza, A., Oliva, A., & Torralba, A. (2014). Object detectors emerge in deep scene CNNs. arXiv preprint arXiv:1412.6856.
- 29/06/2015 (Slides by Eduard Fontdevila) Hariharan, Arbelaez, Girshick, Malik, Simultaneous Detection and Segmentation (ECCV 2014)
- 26/05/2015 (Slides by Marc Bolaños) Kuo, W., Hariharan, B., & Malik, J. (2015). DeepBox: Learning Objectness with Convolutional Networks. arXiv preprint arXiv:1505.02146.
- 21/05/2015 (Slides by Xavier Giró-i-Nieto) Hoffman, J., Guadarrama, S., Tzeng, E. S., Hu, R., Donahue, J., Girshick, R., ... & Saenko, K. (2014). LSDA: Large scale detection through adaptation. In Advances in Neural Information Processing Systems (pp. 3536-3544).
- 28/04/2015 (Slides by Eduard Fontdevila) Chen, L. C., Papandreou, G., Kokkinos, I., Murphy, K., & Yuille, A. L. (2014). Semantic image segmentation with deep convolutional nets and fully connected CRFs. arXiv preprint arXiv:1412.7062.
- 14/04/2015 (Slides by Eduard Fontdevila) Iandola, F., Moskewicz, M., Karayev, S., Girshick, R., Darrell, T., & Keutzer, K. (2014). Densenet: Implementing efficient convnet descriptor pyramids. arXiv preprint arXiv:1404.1869.
- 07/04/2015 (Slides by Xavier Giró-i-Nieto) Zeiler, Matthew D., and Rob Fergus. "Visualizing and understanding convolutional networks." In Computer Vision–ECCV 2014, pp. 818-833. Springer International Publishing, 2014
- 09/03/2015 (Slides by Amaia Salvador) Babenko, Artem, et al. "Neural codes for image retrieval." Computer Vision–ECCV 2014. Springer International Publishing, 2014. 584-599.
- 30/02/2015 (Slides by Junting Pan) Simonyan, Karen, Andrea Vedaldi, and Andrew Zisserman. "Deep inside convolutional networks: Visualising image classification models and saliency maps." arXiv preprint arXiv:1312.6034 (2013)
- 23/02/2015 (Slides by Victor Campos) You, Q., Luo, J., Jin, H., & Yang, J. (2015, September). Robust image sentiment analysis using progressively trained and domain transferred deep networks. In The Twenty-Ninth AAAI Conference on Artificial Intelligence (AAAI).
- 18/02/2015 (Slides by Victor Campos) Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., & Fei-Fei, L. (2014, June). Large-scale video classification with convolutional neural networks. In Computer Vision and Pattern Recognition (CVPR), 2014 IEEE Conference on (pp. 1725-1732). IEEE