Abstract
There is a growing conviction that the future of computing will crucially depend on our ability to better exploit data to produce more intelligent systems. Increasingly, this will involve drawing simultaneously on multiple heterogeneous modalities, to take full advantage of the vast quantities of images and videos now available on the Web and elsewhere. We give several examples of methods that leverage prior knowledge for better, more semantically informed visual analytics, as well as methods that use multimodal data for better textual analytics. Important progress may come from approaches specifically geared towards harvesting rich multimodal knowledge. For example, our Knowlywood system relies on Hollywood movies to learn about human activities. Once acquired, knowledge of this sort can then be re-used across different tasks, much as humans draw on their accumulated knowledge when making sense of the world.