Gerasimos Potamianos: Publications and Patents per Research Area / Topic


Published work is clustered into three broad categories. Listed first are works on perception technologies for ambient intelligence in smart spaces (since 2004), part of which were conducted within the European Projects CHIL, DICIT, NETCARITY, and DIRHA. These are followed by works on audio-visual speech processing (since 1997), including research conducted as part of the European Union Marie Curie Grant AVISPIRE. The final category collects work in other areas, such as emotion recognition from speech and text processing / information extraction (conducted as part of the European Project SYNC3), as well as earlier research (prior to 1998) on language modeling, statistical image analysis, and signal processing / filter design.



    A. AUDIO-VISUAL PERCEPTION TECHNOLOGIES FOR AMBIENT INTELLIGENCE IN SMART SPACES

    General:

  1. S.-H. G. Chan, J. Li, P. Frossard, and G. Potamianos, Special Section on Interactive Multimedia, Editorial, IEEE Transactions on Multimedia, vol. 13, no. 5, pp. 841-843, 2011.
  2. A. Waibel, R. Stiefelhagen, R. Carlson, J. Casas, J. Kleindienst, L. Lamel, O. Lanz, D. Mostefa, M. Omologo, F. Pianesi, L. Polymenakos, G. Potamianos, J. Soldatos, G. Sutschet, and J. Terken, Computers in the Human Interaction Loop, Handbook of Ambient Intelligence and Smart Environments, H. Nakashima, H. Aghajan, and J.C. Augusto (Eds.), Part IX, pp. 1071-1116, Springer, 2010.

    Far-Field Speech Processing:

  4. L.-H. Kim, M. Hasegawa-Johnson, G. Potamianos, and V. Libal, Joint estimation of DOA and speech based on EM beamforming, Proc. Int. Conf. Acoust. Speech Signal Process. (ICASSP), pp. 121-124, Dallas, TX, 2010.
  5. G. Potamianos, L. Lamel, M. Wolfel, J. Huang, E. Marcheret, C. Barras, J. McDonough, J. Hernando, D. Macho, and C. Nadeu, Automatic Speech Recognition, Computers in the Human Interaction Loop, A. Waibel and R. Stiefelhagen (Eds.), Ch. 6, pp. 43-59, Springer, 2009.
  6. J. Huang, E. Marcheret, K. Visweswariah, and G. Potamianos, The IBM RT07 evaluation systems for speaker diarization on lecture meetings, in Multimodal Technologies for Perception of Humans: International Evaluation Workshops CLEAR 2007 and RT 2007, Baltimore, Maryland, 2007, LNCS vol. 4625, pp. 497-508, Springer, Berlin, 2008.
  7. J. Huang, E. Marcheret, K. Visweswariah, V. Libal, and G. Potamianos, The IBM Rich Transcription Spring 2007 speech-to-text systems for lecture meetings, in Multimodal Technologies for Perception of Humans: International Evaluation Workshops CLEAR 2007 and RT 2007, Baltimore, Maryland, 2007, LNCS vol. 4625, pp. 429-441, Springer, Berlin, 2008.
  8. J. Huang, E. Marcheret, K. Visweswariah, V. Libal, and G. Potamianos, Detection, diarization, and transcription of far-field lecture speech, Proc. Conf. Int. Speech Comm. Assoc. (Interspeech), pp. 2161-2164, Antwerp, Belgium, 2007.
  9. E. Marcheret, G. Potamianos, K. Visweswariah, and J. Huang, The IBM RT06s evaluation system for speech activity detection in CHIL seminars, Proc. RT06s Evaluation Works. - held with Joint Works. on Multimodal Interaction and Related Machine Learning Algorithms (MLMI), S. Renals, S. Bengio, and J.G. Fiscus (Eds.), LNCS 4299, pp. 323-335, Washington DC, 2006.
  10. J. Huang, M. Westphal, S. Chen, O. Siohan, D. Povey, V. Libal, A. Soneiro, H. Schulz, T. Ross, and G. Potamianos, The IBM Rich Transcription Spring 2006 speech-to-text system for lecture meetings, Proc. RT06s Evaluation Works. - held with Joint Works. on Multimodal Interaction and Related Machine Learning Algorithms (MLMI), S. Renals, S. Bengio, and J.G. Fiscus (Eds.), LNCS 4299, pp. 432-443, Washington DC, 2006.
  11. S.M. Chu, E. Marcheret, and G. Potamianos, Automatic speech recognition and speech activity detection in the CHIL smart room, Proc. Joint Works. on Multimodal Interaction and Related Machine Learning Algorithms (MLMI), LNCS vol. 3869, pp. 332-343, Edinburgh, United Kingdom, 2005.
  12. E. Marcheret, K. Visweswariah, and G. Potamianos, Speech activity detection fusing acoustic phonetic and energy features, Proc. Europ. Conf. Speech Comm. Technol. (Interspeech), pp. 241-244, Lisbon, Portugal, 2005.
  13. D. Macho, J. Padrell, A. Abad, C. Nadeu, J. Hernando, J. McDonough, M. Wolfel, U. Klee, M. Omologo, A. Brutti, P. Svaizer, G. Potamianos, and S.M. Chu, Automatic speech activity detection, source localization, and speech recognition on the CHIL seminar corpus, Proc. Int. Conf. Multimedia Expo (ICME), Amsterdam, The Netherlands, 2005.

    Conversational Interaction:

  15. G. Potamianos, J. Huang, E. Marcheret, V. Libal, R. Balchandran, M. Epstein, L. Seredi, M. Labsky, L. Ures, M. Black, and P. Lucey, Far-field multimodal speech processing and conversational interaction in smart spaces, Proc. Joint Work. Hands-Free Speech Communication and Microphone Arrays (HSCMA), Trento, Italy, 2008.
  16. R. Balchandran, M. Epstein, G. Potamianos, and L. Seredi, A multi-modal spoken dialog system for interactive TV, Proc. Int. Conf. Multimodal Interfaces (ICMI) - Demo Papers, pp. 191-192, Chania, Greece, 2008.

    Acoustic / Multimodal Scene Analysis for Event Detection / Activity Classification:

  18. V. Libal, B. Ramabhadran, N. Mana, F. Pianesi, P. Chippendale, O. Lanz, and G. Potamianos, Multimodal classification of activities of daily living inside smart homes, Proc. Int. Works. Ambient Assisted Living (IWAAL), LNCS vol. 5518, Part II, pp. 687-694, Salamanca, Spain, 2009.
  19. X. Zhuang, J. Huang, G. Potamianos, and M. Hasegawa-Johnson, Acoustic fall detection using Gaussian mixture models and GMM supervectors, Proc. Int. Conf. Acoust. Speech Signal Process. (ICASSP), pp. 69-72, Taipei, Taiwan, 2009.
  20. J. Huang, X. Zhuang, V. Libal, and G. Potamianos, Long-time span acoustic activity analysis from far-field sensors in smart homes, Proc. Int. Conf. Acoust. Speech Signal Process. (ICASSP), pp. 4173-4176, Taipei, Taiwan, 2009.
  21. A. Leone, G. Diraco, C. Distante, P. Siciliano, M. Malfatti, L. Gonzo, M. Grassi, A. Lombardi, G. Rescio, P. Malcovati, V. Libal, J. Huang, and G. Potamianos, A multi-sensor approach for people fall detection in home environment, Proc. Work. Multi-Camera and Multi-Modal Sensor Fusion (M2SFA2), Held in Conjunction with the 10th Europ. Conf. Computer Vision (ECCV), Marseille, France, 2008.
  22. M. Grassi, A. Leone, M. Malfatti, A. Lombardi, G. Rescio, G. Diraco, C. Distante, V. Libal, J. Huang, G. Potamianos, P. Malcovati, L. Gonzo, and P. Siciliano, A hardware-software framework for high-reliability people fall detection, Proc. 7th IEEE Conf. on Sensors (SENSORS), pp. 1328-1331, Lecce, Italy, 2008.

    Computer Vision:

  24. K. Bernardin, R. Stiefelhagen, A. Pnevmatikakis, O. Lanz, A. Brutti, J. Casas, and G. Potamianos, Person Tracking, Computers in the Human Interaction Loop, A. Waibel and R. Stiefelhagen (Eds.), Ch. 3, pp. 11-22, Springer, 2009.
  25. A. Tyagi, J.W. Davis, and G. Potamianos, Steepest descent for efficient covariance tracking, Proc. IEEE Work. Motion and Video Computing (WMVC), Copper Mountain, Colorado, 2008.
  26. Z. Zhang, G. Potamianos, A.W. Senior, and T.S. Huang, Joint face and head tracking inside multi-camera smart rooms, Signal, Image and Video Processing, vol. 1, pp. 163-178, 2007.
  27. A. Tyagi, M. Keck, J.W. Davis, and G. Potamianos, Kernel-based 3D tracking, Proc. IEEE Int. Work. Visual Surveillance (VS/CVPR), Minneapolis, Minnesota, 2007.
  28. A. Tyagi, G. Potamianos, J.W. Davis, and S.M. Chu, Fusion of multiple camera views for kernel-based 3D tracking, Proc. IEEE Works. Motion and Video Computing (WMVC), Austin, Texas, 2007.
  29. G. Potamianos and Z. Zhang, A joint system for single-person 2D-face and 3D-head tracking in CHIL seminars, Proc. CLEAR Evaluation Works., LNCS vol. 4122, Southampton, United Kingdom, 2006.
  30. Z. Zhang, G. Potamianos, S.M. Chu, J. Tu, and T.S. Huang, Person tracking in smart rooms using dynamic programming and adaptive subspace learning, Proc. Int. Conf. Multimedia Expo (ICME), pp. 2061-2064, Toronto, Canada, 2006.
  31. A.W. Senior, G. Potamianos, S. Chu, Z. Zhang, and A. Hampapur, A comparison of multicamera person-tracking algorithms, Proc. IEEE Int. Works. Visual Surveillance (VS/ECCV), Graz, Austria, 2006.
  32. Z. Zhang, G. Potamianos, M. Liu, and T. Huang, Robust multi-view multi-camera face detection inside smart rooms using spatio-temporal dynamic programming, Proc. Int. Conf. Automatic Face and Gesture Recog. (FGR), Southampton, United Kingdom, 2006.
  33. Z. Zhang, G. Potamianos, A. Senior, S. Chu, and T. Huang, A joint system for person tracking and face detection, Proc. Int. Works. Human-Computer Interaction (ICCV 2005 Works. on HCI), pp. 47-59, Beijing, China, 2005.

    Corpora:

  35. D. Mostefa, N. Moreau, K. Choukri, G. Potamianos, S.M. Chu, A. Tyagi, J.R. Casas, J. Turmo, L. Christoforetti, F. Tobia, A. Pnevmatikakis, V. Mylonakis, F. Talantzis, S. Burger, R. Stiefelhagen, K. Bernardin, and C. Rochet, The CHIL audiovisual corpus for lecture and meeting analysis inside smart rooms, Journal of Language Resources and Evaluation, vol. 41, pp. 389-407, 2008.

    B. AUDIO-VISUAL SPEECH PROCESSING

    Overview Works on Audio-Visual Speech Processing and Recognition:

  37. G. Potamianos, C. Neti, J. Luettin, and I. Matthews, Audio-Visual Automatic Speech Recognition: An Overview, (To Appear In:) Audio-Visual Speech Processing, E. Vatikiotis-Bateson, G. Bailly, and P. Perrier (Eds.), Ch. 9, Cambridge University Press, 2012.
  38. H. Meng, S. Oviatt, G. Potamianos, and G. Rigoll, Introduction to the Special Issue on Multimodal Processing in Speech-Based Interactions, Editorial, IEEE Transactions on Audio, Speech, and Language Processing, vol. 17, no. 3, pp. 409-410, 2009.
  39. G. Potamianos, Audio-Visual Speech Recognition, Short Article, Encyclopedia of Language and Linguistics, Second Edition, (Speech Technology Section - Computer Understanding of Speech), K. Brown (Ed. In Chief), Elsevier, Oxford, United Kingdom, ISBN: 0-08-044299-4, 2006.
  40. P.S. Aleksic, G. Potamianos, and A.K. Katsaggelos, Exploiting Visual Information in Automatic Speech Processing, In: Handbook of Image and Video Processing, Second Edition, A. Bovik (Ed.), Ch. 10.8, pp. 1263-1289, Elsevier Academic Press, Burlington, MA, ISBN: 0-12-119792-1, 2005.
  41. G. Potamianos, C. Neti, G. Gravier, A. Garg, and A.W. Senior, Recent advances in the automatic recognition of audio-visual speech, Invited, Proceedings of the IEEE, vol. 91, no. 9, pp. 1306-1326, 2003.
  42. G. Potamianos, C. Neti, and S. Deligne, Joint audio-visual speech processing for recognition and enhancement, Proc. Works. Audio-Visual Speech Process., pp. 95-104, St. Jorioz, France, 2003.
  43. C. Neti, G. Potamianos, J. Luettin, and E. Vatikiotis-Bateson, Editorial of the Special Issue on Joint Audio-Visual Speech Processing, EURASIP Journal on Applied Signal Processing, vol. 2002, no. 11, 2002.
  44. C. Neti, G. Iyengar, G. Potamianos, A. Senior, and B. Maison, Perceptual interfaces for information interaction: Joint processing of audio and visual information for human-computer interaction, Proc. Int. Conf. Spoken Language Process. (ICSLP), vol. III, pp. 11-14, Beijing, China, 2000.

    Audio-Visual Speech Recognition Systems, Experiments, Data:

  46. G. Galatas, G. Potamianos, and F. Makedon, Audio-visual speech recognition using depth information from the Kinect in noisy video conditions, (Submitted To:) Int. Conf. Pervasive Technologies Related to Assistive Environments (PETRA), Crete, Greece, 2012.
  47. G. Galatas, G. Potamianos, and F. Makedon, Audio-visual speech recognition incorporating facial depth information captured by the Kinect, (Submitted To:) Europ. Conf. Signal Process. (EUSIPCO), Bucharest, Romania, 2012.
  48. G. Galatas, G. Potamianos, D. Kosmopoulos, C. McMurrough, and F. Makedon, Bilingual corpus for AVASR using multiple sensors and depth information, Proc. Int. Conf. Auditory-Visual Speech Process. (AVSP), pp. 103-106, Volterra, Italy, 2011.
  49. J. Huang, G. Potamianos, J. Connell, and C. Neti, Audio-visual speech recognition using an infrared headset, Speech Communication, vol. 44, no. 4, pp. 83-96, 2004.
  50. G. Potamianos, C. Neti, J. Huang, J.H. Connell, S. Chu, V. Libal, E. Marcheret, N. Haas, and J. Jiang, Towards practical deployment of audio-visual speech recognition, Invited, Proc. Int. Conf. Acoust. Speech Signal Process. (ICASSP), vol. 3, pp. 777-780, Montreal, Canada, 2004.
  51. J. Huang, G. Potamianos, and C. Neti, Improving audio-visual speech recognition with an infrared headset, Proc. Works. Audio-Visual Speech Process. (AVSP), pp. 175-178, St. Jorioz, France, 2003.
  52. G. Potamianos and C. Neti, Audio-visual speech recognition in challenging environments, Proc. Eur. Conf. Speech Comm. Tech. (Eurospeech), pp. 1293-1296, Geneva, Switzerland, 2003.
  53. J.H. Connell, N. Haas, E. Marcheret, C. Neti, G. Potamianos, and S. Velipasalar, A real-time prototype for small-vocabulary audio-visual ASR, Proc. Int. Conf. Multimedia Expo (ICME), vol. II, pp. 469-472, Baltimore, MD, 2003.
  54. C. Neti, G. Potamianos, J. Luettin, I. Matthews, H. Glotin, and D. Vergyri, Large-vocabulary audio-visual speech recognition: A summary of the Johns Hopkins Summer 2000 Workshop, Proc. Works. Multimedia Signal Process. (MMSP), pp. 619-624, Cannes, France, 2001.
  55. G. Potamianos, C. Neti, G. Iyengar, and E. Helmuth, Large-vocabulary audio-visual speech recognition by machines and humans, Proc. Europ. Conf. Speech Comm. Technol. (Eurospeech), pp. 1027-1030, Aalborg, Denmark, 2001.
  56. G. Potamianos and C. Neti, Automatic speechreading of impaired speech, Proc. Works. Audio-Visual Speech Process. (AVSP), pp. 177-182, Aalborg, Denmark, 2001.
  57. C. Neti, G. Potamianos, J. Luettin, I. Matthews, H. Glotin, D. Vergyri, J. Sison, A. Mashari, and J. Zhou, Audio-Visual Speech Recognition, Final Workshop 2000 Report, Center for Language and Speech Processing, The Johns Hopkins University, Baltimore, 2000.
  58. G. Potamianos and A. Potamianos, Speaker adaptation for audio-visual speech recognition, Proc. Europ. Conf. Speech Comm. Technol. (Eurospeech), vol. 3, pp. 1291-1294, Budapest, Hungary, 1999.
  59. G. Potamianos, E. Cosatto, H.P. Graf, and D.B. Roe, Speaker independent audio-visual database for bimodal ASR, Proc. Europ. Tutorial Research Work. Audio-Visual Speech Process. (AVSP), pp. 65-68, Rhodes, Greece, 1997.

    Audio-Visual Fusion:

  61. S.M. Chu, V. Goel, E. Marcheret, and G. Potamianos, Method for Likelihood Computation in Multi-Stream HMM Based Speech Recognition, Patent No.: US007480617B2, Jan. 20, 2009.
  62. J.H. Connell, N. Haas, E. Marcheret, C.V. Neti, and G. Potamianos, Audio-Only Backoff in Audio-Visual Speech Recognition System, Patent No.: US007251603B2, July 31, 2007.
  63. E. Marcheret, V. Libal, and G. Potamianos, Dynamic stream weight modeling for audio-visual speech recognition, Proc. Int. Conf. Acoust. Speech Signal Process. (ICASSP), vol. 4, pp. 945-948, Honolulu, HI, 2007.
  64. E. Marcheret, S.M. Chu, V. Goel, and G. Potamianos, Efficient likelihood computation in multi-stream HMM based audio-visual speech recognition, Proc. Int. Conf. Spoken Lang. Process. (ICSLP), Jeju Island, Korea, 2004.
  65. S.M. Chu, V. Libal, E. Marcheret, C. Neti, and G. Potamianos, Multistage information fusion for audio-visual speech recognition, Proc. Int. Conf. Multimedia Expo (ICME), Taipei, Taiwan, 2004.
  66. A. Garg, G. Potamianos, C. Neti, and T.S. Huang, Frame-dependent multi-stream reliability indicators for audio-visual speech recognition, Proc. Int. Conf. Acoust. Speech Signal Process. (ICASSP), vol. I, pp. 24-27, Hong Kong, China, 2003.
  67. G. Gravier, G. Potamianos, and C. Neti, Asynchrony modeling for audio-visual speech recognition, Proc. Human Language Technology Conference (HLT), pp. 1-6, San Diego, CA, 2002.
  68. G. Gravier, S. Axelrod, G. Potamianos, and C. Neti, Maximum entropy and MCE based HMM stream weight estimation for audio-visual ASR, Proc. Int. Conf. Acoust. Speech Signal Process. (ICASSP), pp. 853-856, Orlando, FL, 2002.
  69. G. Potamianos, J. Luettin, and C. Neti, Hierarchical discriminant features for audio-visual LVCSR, Proc. Int. Conf. Acoust. Speech Signal Process. (ICASSP), vol. 1, pp. 165-168, Salt Lake City, UT, 2001.
  70. J. Luettin, G. Potamianos, and C. Neti, Asynchronous stream modeling for large-vocabulary audio-visual speech recognition, Proc. Int. Conf. Acoust. Speech Signal Process. (ICASSP), vol. 1, pp. 169-172, Salt Lake City, UT, 2001.
  71. H. Glotin, D. Vergyri, C. Neti, G. Potamianos, and J. Luettin, Weighting schemes for audio-visual fusion in speech recognition, Proc. Int. Conf. Acoust. Speech Signal Process. (ICASSP), vol. 1, pp. 173-176, Salt Lake City, UT, 2001.
  72. G. Potamianos and C. Neti, Stream confidence estimation for audio-visual speech recognition, Proc. Int. Conf. Spoken Language Process. (ICSLP), vol. III, pp. 746-749, Beijing, China, 2000.
  73. G. Potamianos and H.P. Graf, Discriminative training of HMM stream exponents for audio-visual speech recognition, Proc. Int. Conf. Acoust. Speech Signal Process. (ICASSP), vol. 6, pp. 3733-3736, Seattle, WA, 1998.

    Non-Frontal AVASR:

  75. P. Lucey, G. Potamianos, and S. Sridharan, Visual Speech Recognition Across Multiple Views, Visual Speech Recognition: Lip Segmentation and Mapping, A. Wee-Chung Liew and S. Wang (Eds.), Ch. X, pp. 294-325, Information Science Publishing Press, 2009.
  76. P. Lucey, G. Potamianos, and S. Sridharan, Patch-based analysis of visual speech from multiple views, Proc. Int. Conf. Auditory-Visual Speech Process. (AVSP), pp. 69-73, Tangalooma, Australia, 2008.
  77. P. Lucey, G. Potamianos, and S. Sridharan, A unified approach to multi-pose audio-visual ASR, Proc. Conf. Int. Speech Comm. Assoc. (Interspeech), pp. 650-653, Antwerp, Belgium, 2007.
  78. P. Lucey, G. Potamianos, and S. Sridharan, An extended pose-invariant lipreading system, Proc. Work. Audio-Visual Speech Process. (AVSP), pp. 176-180, Hilvarenbeek, The Netherlands, 2007.
  79. G. Potamianos and P. Lucey, Audio-visual ASR from multiple views inside smart rooms, Proc. Int. Conf. Multisensor Fusion and Integration for Intelligent Systems (MFI), pp. 35-40, Heidelberg, Germany, 2006.
  80. P. Lucey and G. Potamianos, Lipreading using profile versus frontal views, Proc. Works. Multimedia Signal Process. (MMSP), pp. 24-28, Victoria, Canada, 2006.

    Other Audio-Visual Speech Technologies:

  82. K. Kumar, G. Potamianos, J. Navratil, E. Marcheret, and V. Libal, Audio-Visual Speech Synchrony Detection by a Family of Bimodal Linear Prediction Models, Multibiometrics for Human Identification, B. Bhanu and V. Govindaraju (Eds.), Ch. 2, pp. 31-50, Cambridge University Press, 2011.
  83. K. Kumar, J. Navratil, E. Marcheret, V. Libal, and G. Potamianos, Robust audio-visual speech synchrony detection by generalized bimodal linear prediction, Proc. Conf. Int. Speech Comm. Assoc. (Interspeech), Brighton, United Kingdom, 2009.
  84. K. Kumar, J. Navratil, E. Marcheret, V. Libal, G. Ramaswamy, and G. Potamianos, Audio-visual speech synchronization detection using a bimodal linear prediction model, Proc. IEEE Comp. Soc. Works. Biometrics, Held in Association with CVPR, Miami Beach, Florida, 2009.
  85. S. Deligne, C.V. Neti, and G. Potamianos, Audio-Visual Codebook Dependent Cepstral Normalization, Patent No.: US007319955B2, Jan. 15, 2008.
  86. U.V. Chaudhari, C. Neti, G. Potamianos, and G.N. Ramaswamy, Automated Decision Making Using Time-Varying Stream Reliability Prediction, Patent No.: US007228279B2, June 5, 2007.
  87. V. Libal, J. Connell, G. Potamianos, and E. Marcheret, An embedded system for in-vehicle visual speech activity detection, Int. Work. Multimedia Signal Process. (MMSP), pp. 255-258, Chania, Greece, 2007.
  88. P. de Cuetos, G.R. Iyengar, C.V. Neti, and G. Potamianos, System and Method for Microphone Activation Using Visual Speech Cues, Patent No.: US006754373B1, June 22, 2004.
  89. E. Cosatto, H.P. Graf, G. Potamianos, and J. Schroeter, Audio-Visual Selection Process for the Synthesis of Photo-Realistic Talking-Head Animations, Patent No.: US006654018B1, Nov. 25, 2003.
  90. U.V. Chaudhari, G.N. Ramaswamy, G. Potamianos, and C. Neti, Information fusion and decision cascading for audio-visual speaker recognition based on time varying stream reliability prediction, Proc. Int. Conf. Multimedia Expo (ICME), vol. III, pp. 9-12, Baltimore, MD, July 2003.
  91. U.V. Chaudhari, G.N. Ramaswamy, G. Potamianos, and C. Neti, Audio-visual speaker recognition using time-varying stream reliability prediction, Proc. Int. Conf. Acoust. Speech Signal Process. (ICASSP), vol. V, pp. 712-715, Hong Kong, China, 2003.
  92. S. Deligne, G. Potamianos, and C. Neti, Audio-visual speech enhancement with AVCDCN (audio-visual codebook dependent cepstral normalization), Int. Conf. Spoken Lang. Process., pp. 1449-1452, Denver, CO, 2002.
  93. R. Goecke, G. Potamianos, and C. Neti, Noisy audio feature enhancement using audio-visual speech data, Proc. Int. Conf. Acoust. Speech Signal Process. (ICASSP), pp. 2025-2028, Orlando, FL, 2002.
  94. E. Cosatto, G. Potamianos, and H.P. Graf, Audio-visual unit selection for the synthesis of photo-realistic talking-heads, Proc. Int. Conf. Multimedia Expo (ICME), vol. II, pp. 619-622, New York, 2000.

    Visual Feature Extraction:

  96. G. Galatas, G. Potamianos, A. Papangelis, and F. Makedon, Audio visual speech recognition in noisy visual environments, Proc. Int. Conf. Pervasive Technologies Related to Assistive Environments (PETRA), Crete, Greece, 2011.
  97. G. Potamianos and P. Scanlon, Exploiting lower face symmetry in appearance-based automatic speechreading, Proc. Works. Audio-Visual Speech Process. (AVSP), pp. 79-84, Vancouver Island, Canada, 2005.
  98. P. Scanlon, G. Potamianos, V. Libal, and S.M. Chu, Mutual information based visual feature selection for lipreading, Proc. Int. Conf. Spoken Lang. Process. (ICSLP), Jeju Island, Korea, 2004.
  99. G. Potamianos, C. Neti, G. Iyengar, A.W. Senior, and A. Verma, A cascade visual front end for speaker independent automatic speechreading, Int. J. Speech Technology, vol. 4, pp. 193-208, 2001.
  100. G. Potamianos and C. Neti, Improved ROI and within frame discriminant features for lipreading, Proc. Int. Conf. Image Process. (ICIP), vol. III, pp. 250-253, Thessaloniki, Greece, 2001.
  101. G. Iyengar, G. Potamianos, C. Neti, T. Faruquie, and A. Verma, Robust detection of visual ROI for automatic speechreading, Proc. Works. Multimedia Signal Process. (MMSP), pp. 79-84, Cannes, France, 2001.
  102. I. Matthews, G. Potamianos, C. Neti, and J. Luettin, A comparison of model and transform-based visual features for audio-visual LVCSR, Proc. Int. Conf. Multimedia Expo (ICME), Tokyo, Japan, 2001.
  103. G. Potamianos, A. Verma, C. Neti, G. Iyengar, and S. Basu, A cascade image transform for speaker independent automatic speechreading, Proc. Int. Conf. Multimedia Expo (ICME), vol. II, pp. 1097-1100, New York, NY, 2000.
  104. G. Potamianos and H.P. Graf, Linear discriminant analysis for speechreading, Proc. Works. Multimedia Signal Process., pp. 221-226, Los Angeles, CA, 1998.
  105. G. Potamianos, H.P. Graf, and E. Cosatto, An image transform approach for HMM based automatic lipreading, Proc. Int. Conf. Image Process. (ICIP), vol. III, pp. 173-177, Chicago, IL, 1998.

    Face Detection and Tracking:

  107. J. Jiang, G. Potamianos, and G. Iyengar, Improved face finding in visually challenging environments, Proc. Int. Conf. Multimedia Expo (ICME), Amsterdam, The Netherlands, 2005.
  108. J. Jiang, G. Potamianos, H. Nock, G. Iyengar, and C. Neti, Improved face and feature finding for audio-visual speech recognition in visually challenging environments, Proc. Int. Conf. Acoust. Speech Signal Process. (ICASSP), vol. 5, pp. 873-876, Montreal, Canada, 2004.
  109. E. Cosatto, H.P. Graf, and G. Potamianos, Robust Multi-Modal Method for Recognizing Objects, Patent No.: US006118887A, Sep. 12, 2000.
  110. H.P. Graf, E. Cosatto, and G. Potamianos, Machine vision of faces and facial features, Proc. R.I.E.C. Int. Symp. Design Archit. Inform. Process. Systems Based Brain Inform. Princ., pp. 48-53, Sendai, Japan, 1998.
  111. H.P. Graf, E. Cosatto, and G. Potamianos, Robust recognition of faces and facial features with a multi-modal system, Proc. Int. Conf. Systems Man Cybern. (SMC), pp. 2034-2039, Orlando, FL, 1997.

    C. OTHER WORK

    Emotion Recognition from Speech:

  113. P. Giannoulis and G. Potamianos, A hierarchical approach with feature selection for emotion recognition from speech, (To Appear In:) Proc. Int. Conf. Language Resources and Evaluation (LREC), Istanbul, Turkey, 2012.

    Text Processing / Information Extraction:

  115. N. Sarris, G. Potamianos, J.-M. Renders, C. Grover, E. Karstens, L. Kallipolitis, V. Tountopoulos, G. Petasis, A. Krithara, M. Galle, G. Jacquet, B. Alex, R. Tobin, and L. Bounegru, A system for synergistically structuring news content from traditional media and the blogosphere, Proc. E-Challenges Conference, Florence, Italy, 2011.

    Statistical Language Modeling:

  117. G. Potamianos and F. Jelinek, A study of n-gram and decision tree letter language modeling methods, Speech Communication, vol. 24, no. 3, pp. 171-192, 1998.

    Statistical Image Analysis - Markov Random Fields:

  119. G. Potamianos and J. Goutsias, Stochastic approximation algorithms for partition function estimation of Gibbs random fields, IEEE Transactions on Information Theory, vol. 43, no. 6, pp. 1948-1965, 1997.
  120. G. Potamianos, Efficient Monte Carlo estimation of partition function ratios of Markov random fields, Proc. Conf. Inform. Sci. Systems (CISS), vol. II, pp. 1212-1215, Princeton, NJ, 1996.
  121. G. Potamianos and J. Goutsias, A unified approach to Monte Carlo likelihood estimation of Gibbs random field images, Proc. Conf. Inform. Sci. Systems (CISS), vol. I, pp. 84-90, Princeton, NJ, 1994.
  122. G. Potamianos, Stochastic Simulation Algorithms for Partition Function Estimation of Markov Random Field Images, Ph.D. Thesis, Department of Electrical and Computer Engineering, The Johns Hopkins University, Baltimore, MD, 1994.
  123. G. Potamianos and J. Goutsias, Partition function estimation of Gibbs random field images using Monte Carlo simulations, IEEE Transactions on Information Theory, vol. 39, no. 4, pp. 1322-1332, 1993.
  124. G. Potamianos and J. Goutsias, An analysis of Monte Carlo methods for likelihood estimation of Gibbsian images, Proc. Int. Conf. Acoust. Speech Signal Process. (ICASSP), vol. V, pp. 519-522, Minneapolis, MN, 1993.
  125. G. Potamianos and J. Goutsias, On computing the likelihood function of partially observed Markov random field images using Monte Carlo simulations, Proc. Conf. Inform. Sci. Systems (CISS), vol. I, pp. 357-362, Princeton, NJ, 1992.
  126. G. Potamianos and J. Goutsias, A novel method for computing the partition function of Markov random field images using Monte Carlo simulations, Proc. Int. Conf. Acoust. Speech Signal Process. (ICASSP), vol. 4, pp. 2325-2328, Toronto, Canada, 1991.

    Signal Processing - Filter Design:

  128. G. Potamianos and J. Diamessis, Frequency sampling design of 2-D IIR filters using continued fractions, Proc. Int. Symp. Circuits Systems (ISCAS), vol. 3, pp. 2454-2457, New Orleans, LA, 1990.
  129. J. Diamessis and G. Potamianos, A novel method for designing IIR filters with nonuniform samples, Proc. Conf. Inform. Sci. Systems (CISS), vol. 1, pp. 192-195, Princeton, NJ, 1990.
  130. J. Diamessis and G. Potamianos, Modeling unequally spaced 2-D discrete signals by rational functions, Proc. Int. Symp. Circuits Systems (ISCAS), vol. 2, pp. 1508-1511, Portland, OR, 1989.
  131. G. Potamianos, Design of 2-D Digital Filters Using Continued Fractions, Diploma Thesis, Department of Electrical and Computer Engineering, National Technical University of Athens, Athens, Greece, 1988.

Last Updated on April 20, 2012

