IM2 Publications

[1] N. Li, O. Mubin, F. Kaplan, and P. Dillenbourg. A tabletop environment for augmenting meetings with background search. Under peer review for the ITS2011 conference, Kobe, Japan. [ bib ]
[2] L. Goldmann, A. Samour, T. Ebrahimi, and T. Sikora. Multimodal person search combining information fusion and relevance feedback. In IEEE International Workshop on Multimedia Signal Processing (MMSP 2009). [ bib | http | Abstract ]
[3] F. De Simone, M. Naccari, M. Tagliasacchi, F. Dufaux, S. Tubaro, and T. Ebrahimi. Subjective assessment of h.264/avc video sequences transmitted over a noisy channel. In First International Workshop on Quality of Multimedia Experience (QoMEX 2009). [ bib | http | Abstract ]
[4] J. S. Lee, F. De Simone, and T. Ebrahimi. Influence of audio-visual attention on perceived quality of standard definition multimedia content. In First International Workshop on Quality of Multimedia Experience (QoMEX 2009). [ bib | www: | Abstract ]
[5] J. S. Lee and T. Ebrahimi. Two-level bimodal association for audio-visual speech recognition. In International Conference on Advanced Concepts for Intelligent Vision Systems (ACIVS'09). [ bib | Abstract ]
[6] D. Gatica-Perez and J. M. Odobez. Visual attention, speaking activity, and group conversational analysis in multi-sensor environments. In H. Nakashima, J. Augusto, H. Aghajan (Eds.), Handbook of Ambient Intelligence and Smart Environments, Springer, in press. [ bib ]
[7] A. Popescu-Belis. Multimodal database annotation formats and standards, software architecture for multimodal interfaces. In J. Ph. Thiran, H. Bourlard, and F. Marques, editors, Multimodal Signal Processing: Methods and Techniques to Build Multimodal Interactive Systems. Academic Press. In press. [ bib ]
[8] E. Mugellini, D. Lalanne, B. Dumas, F. Evéquoz, S. Gerardi, A. Le Calvé, A. Boder, R. Ingold, and O. Khaled. Memodules as tangible shortcuts to multimedia information. [ bib ]
[9] D. Gatica-Perez. Modeling interest in face-to-face conversations from multimodal nonverbal behavior. In J.-P. Thiran, H. Bourlard, and F. Marques (Eds.), Multimodal Signal Processing, Academic Press, in press. [ bib ]
[10] D. Brodbeck, R. Mazza, and D. Lalanne. Interactive visualization - a survey. [ bib ]
[11] B. Noris, K. Benmachiche, and A. Billard. Calibration-free eye gaze direction detection with gaussian processes. In International Conference on Computer Vision Theory and Applications (VISAPP 08). [ bib ]
[12] B. Dumas, D. Lalanne, and S. Oviatt. Multimodal interfaces: a survey of principles, models and frameworks. [ bib ]
[13] P. Motlicek, H. Hermansky, H. Garudadri, and N. Srinivasamurthy. Audio coding based on long temporal contexts. IDIAP-RR 30, IDIAP, 2006. [ bib | .ps.gz | .pdf | Abstract ]
[14] D. Zhang, D. Gatica-Perez, and S. Bengio. Exploring contextual information in a layered framework for group action recognition. In the Eighth International Conference on Multimodal Interfaces (ICMI'06), 2006. IDIAP-RR 06-41. [ bib | .ps.gz | .pdf | Abstract ]
[15] D. Barber and S. Chiappa. Unified inference for variational bayesian linear gaussian state-space models. In NIPS, 2006. IDIAP-RR 06-50. [ bib | .ps.gz | .pdf | Abstract ]
[16] H. Ketabdar and H. Hermansky. Identifying unexpected words using in-context and out-of-context phoneme posteriors. IDIAP-RR 68, IDIAP, 2006. [ bib | .ps.gz | .pdf | Abstract ]
[17] A. Just. Two-handed gestures for human-computer interaction. Idiap-rr, École Polytechnique Fédérale de Lausanne, 2006. PhD Thesis #3683 at the École Polytechnique Fédérale de Lausanne. [ bib | .ps.gz | .pdf | Abstract ]
[18] L. Pérez-Freire, F. Pérez-González, and S. Voloshynovskiy. An accurate analysis of scalar quantization-based data hiding. IEEE Trans. on Information Forensics and Security, 1(1):80–86, 2006. [ bib | .pdf ]
[19] S. Chiappa. Analysis and classification of eeg signals using probabilistic models for brain computer interfaces. PhD thesis, École Polytechnique Fédérale de Lausanne, 2006. [ bib | .ps.gz | .pdf ]
[20] R. Bertolami, B. Halter, and H. Bunke. Combination of multiple handwritten text line recognition systems with a recursive approach. In Proc. 10th Int. Workshop Frontiers in Handwriting Recognition, pages 61–65, 2006. [ bib ]
[21] S. Voloshynovskiy, O. Koval, M. K. Mihcak, and T. Pun. The edge process model and its application to information hiding capacity analysis. IEEE Trans. on Signal Processing, 54(5):1813–1825, 2006. [ bib | .pdf ]
[22] G. Andreani, G. Di Fabbrizio, M. Gilbert, D. Gillick, D. Hakkani-Tur, and O. Lemon. Lets discoh: Collecting an annotated open corpus with dialog acts and reward signals for natural language helpdesks. Proc. IEEE/ACL Workshop on Spoken Language Technology, 2006. [ bib ]
[23] P. Motlicek, V. Ullal, and H. Hermansky. Wide-band perceptual audio coding based on frequency-domain linear prediction. IDIAP-RR 58, IDIAP, 2006. [ bib | .ps.gz | .pdf | Abstract ]
[24] J. Richiardi and A. Drygajlo. Applying biometrics to identity documents: implementation issues. Snsf ambai project technical report, Swiss Federal Institute of Technology, 2006. [ bib ]
[25] J. Luo, A. Pronobis, and B. Caputo. Svm-based transfer of visual knowledge across robotic platforms. IDIAP-RR 65, IDIAP, 2006. [ bib | .ps.gz | .pdf | Abstract ]
[26] B. Leibe, N. Cornelis, K. Cornelis, and L. van Gool. Integrating recognition and reconstruction for cognitive traffic scene analysis from a moving vehicle. In DAGM Annual Pattern Recognition Symposium, volume 4174 of LNCS, pages 192–201. Springer, 2006. [ bib ]
[27] A. Schlapbach and H. Bunke. Off-line writer verification: a comparison of a hidden markov model (hmm) and a gaussian mixture model (gmm) based system. In Proc. 10th Int. Workshop Frontiers in Handwriting Recognition, pages 275–280, 2006. [ bib ]
[28] S. Ba and J. M. Odobez. Recognizing people's focus of attention from head poses: a study. IDIAP-RR 42, IDIAP, 2006. [ bib | .ps.gz | .pdf | Abstract ]
[29] S. Cuendet, D. Hakkani-Tur, and G. Tur. Model adaptation for sentence segmentation from speech. Proc. IEEE/ACL Workshop on Spoken Language Technology, 2006. [ bib ]
[30] S. Marcel, Y. Rodriguez, M. Guillemot, and A. Popescu-Belis. Annotation of face detection: description of xml format and files. IDIAP-COM 06, IDIAP, 2006. [ bib | .ps.gz | .pdf ]
[31] Y. Rodriguez. Face detection and verification using local binary patterns. Idiap-rr, École Polytechnique Fédérale de Lausanne, 2006. PhD Thesis #3681 at the École Polytechnique Fédérale de Lausanne. [ bib | .ps.gz | .pdf | Abstract ]
[32] D. Hillard, Z. Huang, H. Ji, R. Grishman, D. Hakkani-Tur, M. Harper, M. Ostendorf, and W. Wang. Impact of automatic comma prediction on pos/name tagging of speech. Proc. IEEE/ACL Workshop on Spoken Language Technology, 2006. [ bib ]
[33] M. Everingham, A. Zisserman, C. Williams, L. van Gool, M. Allan, C. Bishop, O. Chapelle, N. Dalal, T. Deselaers, G. Dorko, S. Duffner, J. Eichhorn, J. Farquhar, M. Fritz, C. Garcia, T. Griffiths, F. Jurie, D. Keysers, M. Koskela, J. Laaksonen, D. Larlus, B. Leibe, H. Meng, H. Ney, B. Schiele, C. Schmid, E. Seemann, J. Shawe-Taylor, A. Storkey, S. Szedmak, B. Triggs, I. Ulusoy, V. Viitaniemi, and J. Zhang. The 2005 pascal visual object class challenge. In Selected Proceedings of the 1st PASCAL Challenges Workshop, Lecture Notes in AI. Springer, 2006. [ bib ]
[34] P. Wey, B. Fischer, H. Bay, and J. M. Buhmann. Dense stereo by triangular meshing and cross validation. In DAGM-Symposium, pages 708–717, 2006. [ bib ]
[35] P. Müller, P. Wonka, S. Haegler, A. Ulmer, and L. van Gool. Procedural modeling of buildings. In Proceedings of ACM SIGGRAPH 2006 / ACM Transactions on Graphics, volume 25, pages 614–623. ACM Press, 2006. [ bib ]
[36] D. Zhang. Probabilistic graphical models for human interaction analysis. PhD thesis, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland, 2006. thesis # (IDIAP-RR 06-78). [ bib | .ps.gz | .pdf | Abstract ]
[37] E. L. Torre, B. Caputo, and T. Tommasi. Melanoma recognition using kernel classifiers. IDIAP-RR 53, IDIAP, 2006. [ bib | .ps.gz | .pdf | Abstract ]
[38] M. Keller. Machine learning approaches to text representation using unlabeled data. PhD thesis, Ecole Polytechnique Fédérale de Lausanne, 2006. IDIAP-RR 06-76. [ bib | .ps.gz | .pdf | Abstract ]
[39] P. Quelhas and J. M. Odobez. Natural scene image modeling using color and texture visterms. In Conference on Image and Video Retrieval CIVR, 2006. IDIAP-RR 06-17. [ bib | .ps.gz | .pdf | Abstract ]
[40] B. Mesot and D. Barber. Switching linear dynamical systems for noise robust speech recognition. IDIAP-RR 08, IDIAP, 2006. [ bib | .ps.gz | .pdf | Abstract ]
[41] J. Vepa and S. King. Subjective evaluation of join cost and smoothing methods for unit selection speech synthesis. IEEE Trans. on Audio, Speech and Language Processing, 14(5):1763–1771, 2006. IDIAP-RR 05-34. [ bib | .ps.gz | .pdf | Abstract ]
[42] M. Müller, F. Evéquoz, and D. Lalanne. Tjass, a smart board for augmenting card game playing and learning (demo). In Symposium on User Interface Software and Technology (UIST 2006), pages 67–68, Montreux (Switzerland), 2006. [ bib ]
[43] S. Voloshynovskiy, O. Koval, E. Topak, J. E. V. Forcen, and T. Pun. On reversibility of random binning based data-hiding techniques: security perspectives. In ACM Multimedia and Security Workshop 2006, Geneva, Switzerland, 2006. [ bib | .ps ]
[44] N. Moüenne-Loccoz, B. Janvier, S. Marchand-Maillet, and E. Bruno. Handling temporal heterogeneous data for content-based management of large video collections. Multimedia Tools and Applications, 31:309–325, 2006. [ bib ]
[45] J. Richiardi and A. Drygajlo. Applying biometrics to identity documents: estimating and coping with errors. Snsf ambai project technical report, Swiss Federal Institute of Technology, 2006. [ bib ]
[46] T. Spindler, C. Wartmann, D. Roth, A. Steffen, L. Hovestadt, and L. van Gool. Privacy in video surveilled areas. In International Conference on Privacy, Security and Trust (PST 2006), 2006. [ bib ]
[47] B. Leibe, K. Mikolajczyk, and B. Schiele. Efficient clustering and matching for object class recognition. In British Machine Vision Conference (BMVC), 2006. [ bib ]
[48] B. Leibe, K. Mikolajczyk, and B. Schiele. Segmentation based multi-cue integration for object detection. In British Machine Vision Conference (BMVC), 2006. [ bib ]
[49] G. Lathoud. Observations on multi-band asynchrony in distant speech recordings. IDIAP-RR 74, IDIAP, Martigny, Switzerland, 2006. [ bib | .ps.gz | .pdf | Abstract ]
[50] G. Lathoud, M. Magimai-Doss, and H. Bourlard. Unsupervised spectral subtraction for noise-robust asr on unknown transmission channels. IDIAP-RR 09, IDIAP, Martigny, Switzerland, 2006. [ bib | .ps.gz | .pdf | Abstract ]
[51] O. Cheng, J. Dines, and M. Magimai-Doss. A generalized dynamic composition algorithm of weighted finite state transducers for large vocabulary speech recognition. IDIAP-RR 62, IDIAP, 2006. Submitted for publication. [ bib | .ps.gz | .pdf | Abstract ]
[52] J. E. Vila-Forcén, S. Voloshynovskiy, O. Koval, and T. Pun. Facial image compression based on structured codebooks in overcomplete domain. EURASIP Journal on Applied Signal Processing, Frames and overcomplete representations in signal processing, communications, and information theory special issue, 2006(Article ID 69042):1–11, 2006. [ bib | .pdf ]
[53] J. E. Vila-Forcén, S. Voloshynovskiy, O. Koval, and T. Pun. Costa problem under channel ambiguity. In Proceedings of 2006 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Toulouse, France, 2006. [ bib | .pdf ]
[54] M. F. BenZeghiba and H. Bourlard. User-customized password speaker verification using multiple reference and background models. Speech Communication, 8:1200–1213, 2006. IDIAP-RR 04-41. [ bib | .ps.gz | .pdf | Abstract ]
[55] G. Lathoud. Spatio-temporal analysis of spontaneous speech with microphone arrays. PhD thesis, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland, 2006. PhD Thesis #3689 at the École Polytechnique Fédérale de Lausanne (IDIAP-RR 06-77). [ bib | .ps.gz | .pdf | Abstract ]
[56] G. Chanel, J. Kronegg, D. Grandjean, and T. Pun. Emotion assessment: arousal evaluation using eeg's and peripheral physiological signals. In B. Gunsel, A. K. Jain, A. M. Tekalp, and B. Sankur, editors, Proc. Int. Workshop Multimedia Content Representation, Classification and Security (MRCS), volume 4105, pages 530–537, Istanbul, Turkey, 2006. Lecture Notes in Computer Science, Springer. [ bib ]
[57] M. Keller and S. Bengio. A multitask learning approach to document representation using unlabeled data. IDIAP-RR 44, IDIAP, 2006. [ bib | .ps.gz | .pdf | Abstract ]
[58] B. Mesot and D. Barber. A bayesian alternative to gain adaptation in autoregressive hidden markov models. IDIAP-RR 55, IDIAP, 2006. [ bib | .ps.gz | .pdf | Abstract ]
[59] M. Melichar, P. Cenek, M. Ailomaa, A. Lisowska, and M. Rajman. From vocal to multimodal dialogue management. In Eighth International Conference on Multimodal Interfaces (ICMI'06), 2006. [ bib ]
[60] N. Poh and S. Bengio. Estimating the confidence interval of expected performance curve in biometric authentication using joint bootstrap. IDIAP-RR 25, IDIAP, 2006. Submitted for publication. [ bib | .ps.gz | .pdf | Abstract ]
[61] M. Liwicki and H. Bunke. Hmm-based on-line recognition of handwritten whiteboard notes. In Proceedings 10th International Workshop Frontiers in Handwriting Recognition, pages 595–599, 2006. [ bib ]
[62] S. Marcel, J. Keomany, and Y. Rodriguez. Robust-to-illumination face localisation using active shape models and local binary patterns. IDIAP-RR 47, IDIAP, 2006. Submitted for publication. [ bib | .ps.gz | .pdf | Abstract ]
[63] K. Smith, S. Schreiber, V. Beran, I. Potúcek, G. Rigoll, and D. Gatica-Perez. Multi-person tracking in meetings: a comparative study. In Multimodal Interaction and Related Machine Learning Algorithms (MLMI), 2006. IDIAP-RR 06-38. [ bib | .ps.gz | .pdf | Abstract ]
[64] K. Smith, S. Ba, J. M. Odobez, and D. Gatica-Perez. Tracking attention for multiple people: wandering visual focus of attention estimation. IDIAP-RR 40, IDIAP, 2006. Submitted for publication. [ bib | .ps.gz | .pdf | Abstract ]
[65] S. Ba and J. M. Odobez. A study on visual focus of attention recognition from head pose in a meeting room. In 3rd Joint Workshop on Multimodal Interaction and Related Machine Learning Algorithms (MLMI06), 2006. IDIAP-RR 06-10. [ bib | .ps.gz | .pdf | Abstract ]
[66] J. del R. Millán, F. Renkens, J. Mouriño, and W. Gerstner. Non-invasive brain-actuated control of a mobile robot by human eeg. In 2006 IMIA Yearbook of Medical Informatics. Schattauer Verlag, 2006. [ bib | Abstract ]
[67] A. Hannani, D. Toledano, D. Petrovska, A. Montero-Asenjo, and J. Hennebert. Using data-driven and phonetic units for speaker verification. In IEEE Speaker and Language Recognition Workshop (Odyssey 2006), Puerto Rico, 2006. [ bib ]
[68] N. Poh and S. Bengio. Using chimeric users to construct fusion classifiers in biometric authentication tasks: an investigation. In IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), 2006. IDIAP-RR 05-59. [ bib | .ps.gz | .pdf | Abstract ]
[69] P. C. Cattin, H. Bay, L. van Gool, and G. Székely. Retina mosaicing using local features. In Medical Image Computing and Computer-Assisted Intervention (MICCAI), volume 4191 of LNCS, pages 185–192, 2006. [ bib ]
[70] J. Mariéthoz. Discriminant models for text-independent speaker verification. IDIAP-RR 70, IDIAP, 2006. [ bib | .ps.gz | .pdf | Abstract ]
[71] T. Pun, T. I. Alecu, G. Chanel, J. Kronegg, and S. Voloshynovskiy. Brain-computer interaction research at the computer vision and multimedia laboratory, university of geneva. IEEE Trans. Neural Systems and Rehabilitation Engineering, Special Issue on Brain-Computer Interaction, 14(2):210–213, 2006. [ bib ]
[72] C. Hemptinne. Master thesis: integration of the harmonic plus noise model (hnm) into the hidden markov model-based speech synthesis system (hts). IDIAP-RR 69, IDIAP, 2006. [ bib | .ps.gz | .pdf ]
[73] S. Kosinov, S. Marchand-Maillet, I. Kozintsev, C. Dulong, and T. Pun. Dual diffusion model of spreading activation for content-based image retrieval. In 8th ACM SIGMM - International Workshop on Multimedia Information Retrieval, Santa Barbara, CA, USA, 2006. [ bib ]
[74] H. K. Maganti, P. Motlicek, and D. Gatica-Perez. Unsupervised speech/non-speech detection for automatic speech recognition in meeting rooms. IDIAP-RR 57, IDIAP, Martigny, Switzerland, 2006. [ bib | .ps.gz | .pdf | Abstract ]
[75] B. Janvier, E. Bruno, S. Marchand-Maillet, and T. Pun. Handling temporal heterogeneous data for content-based management of large video collections. Multimedia Tools and Applications, 30:273–288, 2006. [ bib ]
[76] A. Peregoudov, A. Vinciarelli, and H. Bourlard. Assessing the effectiveness of slides as a mean to improve the automatic transcription of oral presentations. IDIAP-RR 56, IDIAP, 2006. Submitted for publication. [ bib | .ps.gz | .pdf | Abstract ]
[77] S. Cuendet. Model adaptation for sentence unit segmentation from speech. IDIAP-RR 64, IDIAP, Martigny, Switzerland, 2006. [ bib | .ps.gz | .pdf | Abstract ]
[78] V. Ullal and P. Motlicek. Audio coding based on long temporal segments: experiments with quantization of excitation signal. IDIAP-RR 46, IDIAP, 2006. [ bib | .ps.gz | .pdf | Abstract ]
[79] N. Poh. Multi-system biometric authentication: optimal fusion and user-specific information. PhD thesis, École Polytechnique Fédérale de Lausanne, 2006. [ bib | .ps.gz | .pdf | Abstract ]
[80] F. Mendels, J. Ph. Thiran, and P. Vandergheynst. Matching pursuit-based shape representation and recognition using scale-space. International Journal of Imaging Systems and Technology, 6(15):162–180, 2006. [ bib | DOI | http ]
[81] A. Janin, A. Stolcke, X. Anguera, K. Boakye, O. Cetin, J. Frankel, and J. Zheng. The icsi-sri spring 2006 meeting evaluation system. In S. Renals and S. Bengio, editors, Machine Learning for Multimodal Interaction: Third International Workshop (MLMI 2006); Lecture Notes in Computer Science. Springer, 2006. [ bib ]
[82] A. Buttfield and J. del R. Millán. Online classifier adaptation in brain-computer interfaces. IDIAP-RR 16, IDIAP, 2006. [ bib | .ps.gz | .pdf | Abstract ]
[83] O. Koval, S. Voloshynovskiy, T. Holotyak, and T. Pun. Information-theoretic analysis of steganalysis in real images. In ACM Multimedia and Security Workshop 2006, Geneva, Switzerland, 2006. [ bib | .ps ]
[84] H. Chiquet, F. Evéquoz, and D. Lalanne. Elcano, a tangible multimedia browser (demo). In Symposium on User Interface Software and Technology (UIST 2006), pages 51–52, Montreux (Switzerland), 2006. [ bib ]
[85] J. Luo, A. Pronobis, B. Caputo, and P. Jensfelt. Incremental learning for place recognition in dynamic environments. IDIAP-RR 52, IDIAP, 2006. [ bib | .ps.gz | .pdf | Abstract ]
[86] R. Rienks, D. Zhang, D. Gatica-Perez, and W. Post. Detection and application of influence rankings in small group meetings. In ICMI '06: Proceedings of the 8th international conference on Multimodal interfaces, pages 257–264, New York, NY, USA, 2006. ACM Press. [ bib | DOI ]
[87] G. Tur, U. Guz, and D. Hakkani-Tur. Model adaptation for dialog act tagging. Proc. IEEE/ACL Workshop on Spoken Language Technology, 2006. [ bib ]
[88] M. Radgohar, F. Evéquoz, and D. Lalanne. Phong, augmenting virtual and real gaming experience (demo). In Symposium on User Interface Software and Technology (UIST 2006), pages 71–72, Montreux (Switzerland), 2006. [ bib ]
[89] C. Dimitrakakis. Ensembles for sequence learning. PhD thesis, École Polytechnique Fédérale de Lausanne, 2006. [ bib | .ps.gz | .pdf | Abstract ]
[90] A. Buttfield, P. W. Ferrez, and J. del R. Millán. Towards a robust bci: error potentials and online learning. IEEE Trans. on Neural Systems and Rehabilitation Engineering, 14(2):164–168, 2006. [ bib | .pdf | Abstract ]
[91] T. I. Alecu, S. Voloshynovskiy, and T. Pun. The gaussian transform of distributions: definition, computation and application. IEEE Trans. on Signal Processing, 54(8):2976–2995, 2006. [ bib ]
[92] D. Moore. The juicer lvcsr decoder - user manual for juicer version 0.5.0. IDIAP-COM 03, IDIAP, 2006. [ bib | .ps.gz | .pdf | Abstract ]
[93] A. Pozdnoukhov. Prior knowledge in kernel methods. PhD thesis, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland, 2006. PhD Thesis #3606 at the École Polytechnique Fédérale de Lausanne (IDIAP-RR 06-66). [ bib | .ps.gz | .pdf | Abstract ]
[94] G. Heusch and S. Marcel. A novel statistical generative model dedicated to face recognition. Idiap-RR-39-2007, IDIAP, 2007. [ bib | Abstract ]
[95] P. Motlicek, S. Ganapathy, H. Hermansky, and H. Garudadri. Scalable wide-band audio codec based on frequency domain linear prediction. IDIAP-RR 16, IDIAP, 2007. [ bib | .ps.gz | .pdf | Abstract ]
[96] H. Hung, D. Jayagopi, C. Yeo, G. Friedland, S. Ba, J. M. Odobez, K. Ramchandran, N. Mirghafori, and D. Gatica-Perez. Using audio and video features to classify the most dominant person in meetings. In Proceedings of ACM Multimedia 2007, pages 835–838, Augsburg, Germany, 2007. [ bib ]
[97] F. Orabona, C. Castellini, B. Caputo, J. Luo, and G. Sandini. Indoor place recognition using online independent support vector machines. In 18th British Machine Vision Conference (BMVC07), pages 1090–1099, 2007. [ bib | Abstract ]
[98] P. Bouillon, M. Rayner, B. Novellas Vall, M. Starlander, M. Santaholma, Y. Nakao, and N. Chatzichrisafis. Une grammaire partagée multi-tâche pour le traitement de la parole : application aux langues romanes. TAL (Traitement Automatique des Langues), 47(3), 2007. [ bib ]
[99] H. Paugam-Moisy, R. Martinez, and S. Bengio. A supervised learning approach based on stdp and polychronization in spiking neuron networks. In European Symposium on Artificial Neural Networks, ESANN, 2007. IDIAP-RR 06-54. [ bib | .ps.gz | .pdf | Abstract ]
[100] A. Vinciarelli and S. Favre. Role recognition in radio programs using social affiliation networks and mixtures of discrete distributions: an approach inspired by social cognition. Idiap-RR-40-2007, IDIAP, 2007. Submitted for publication. [ bib | Abstract ]
[101] S. Marcel. Joint bi-modal face and speaker authentication using explicit polynomial expansion. IDIAP-RR 14, IDIAP, 2007. Submitted for publication. [ bib | .ps.gz | .pdf ]
[102] E. Szekely, E. Bruno, and S. Marchand-Maillet. Clustered multidimensional scaling for exploration in information retrieval. In International Conference on the Theory of Information Retrieval, Budapest, HU, 2007. Submitted. [ bib ]
[103] H. Bunke and T. Varga. Off-line roman cursive handwriting recognition. Digital Document Processing: Major Directions and Recent Advances, 20:165–173, 2007. [ bib ]
[104] M. Levit, D. Hakkani-Tur, G. Tur, and D. Gillick. Integrating several annotation layers for statistical information distillation. IEEE workshop on Automatic Speech Recognition and Understanding (ASRU 07), Kyoto, 2007. [ bib ]
[105] J. Kronegg, G. Chanel, S. Voloshynovskiy, and T. Pun. Eeg-based synchronized brain-computer interfaces: a model for optimizing the number of mental tasks. IEEE Trans. on Neural Systems and Rehabilitation Engineering, 15(1):50–58, 2007. [ bib ]
[106] E. Shriberg. Higher level features in speaker recognition. In C. Muller, editor, Speaker Classification I. Lecture Notes in Computer Science, Springer, 2007. [ bib ]
[107] A. Lisowska, M. Betrancourt, S. Armstrong, and M. Rajman. Minimizing modality bias when exploring input preference for multimodal systems in new domains: the archivus case study. In CHI '07, 2007. [ bib ]
[108] M. Huijbregts, C. Wooters, and R. Ordelman. Filtering the unknown: Speech activity detection in heterogeneous video collections. to appear in Proceedings of Interspeech, Antwerp, 2007. [ bib ]
[109] J. Dines and M. Magimai-Doss. A study of phoneme and grapheme based context-dependent asr systems. IDIAP-RR 12, IDIAP, 2007. [ bib | .ps.gz | .pdf | Abstract ]
[110] J. Dines and J. Vepa. Direct optimisation of a multilayer perceptron for the estimation of cepstral mean and variance statistics. IDIAP-RR 13, IDIAP, 2007. [ bib | .ps.gz | .pdf | Abstract ]
[111] C. Gaudard, G. Aradilla, and H. Bourlard. Speech recognition based on template matching and phone posterior probabilities. IDIAP-COM 02, IDIAP, 2007. [ bib | .ps.gz | .pdf ]
[112] F. Valente, J. Vepa, and H. Hermansky. Multi-stream features combination based on dempster-shafer rule for lvcsr system. In Interspeech 2007, 2007. IDIAP-RR 07-09. [ bib | .ps.gz | .pdf | Abstract ]
[113] P. W. Ferrez and J. del R. Millán. Error-related eeg potentials in brain-computer interfaces. In G. Dornhege, J. del R. Millán, T. Hinterberger, D. McFarland, and K. R. Müller, editors, Towards Brain-Computer Interfacing. The MIT Press, 2007. [ bib | Abstract ]
[114] H. Bunke, P. Dickinson, A. Humm, C. Irniger, and M. Kraetzl. Graph sequence visualisation and its application to computer network monitoring and abnormal event detection. In A. Kandel, H. Bunke, and M. Last, editors, Applied Graph Theory in Computer Vision and Pattern Recognition, pages 227–245. Springer, 2007. [ bib ]
[115] H. Hung, D. Jayagopi, C. Yeo, G. Friedland, S. Ba, J. M. Odobez, K. Ramchandran, N. Mirghafori, and D. Gatica-Perez. Using audio and video features to classify the most dominant person in a group meeting. 2007. IDIAP-RR 07-29. [ bib | Abstract ]
[116] A. Lisowska, S. Armstrong, M. Melichar, M. Ailomaa, and M. Rajman. The wizard of oz meets multimodal language-enabled gui interfaces: new challenges. In Proceedings of CHI '07, Beyond Current User Research: Designing Methods for New Users, T, 2007. [ bib ]
[117] W. Li and H. Bourlard. Non-linear spectral stretching for in-car speech recognition. In Interspeech, 2007. [ bib ]
[118] S. Renals, T. Hain, and H. Bourlard. Recognition and understanding of meetings: the ami and amida projects. In Proc. of the IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU'07, pages 238–247, 2007. IDIAP-RR 07-46. [ bib | DOI | Abstract ]
[119] J. Kittler, N. Poh, O. Fatukasi, K. Messer, K. Kryszczuk, J. Richiardi, and A. Drygajlo. Quality dependent fusion of intramodal and multimodal biometric experts. In Proc. SPIE Defense and Security Symposium, Orlando, USA, 2007. [ bib ]
[120] D. Morrison, S. Marchand-Maillet, and E. Bruno. Hierarchical long-term learning for automatic image annotation. In Proceedings 2nd International Conference on Semantic and Digital Media Technologies, Genova, Italy, 2007. [ bib ]
[121] F. Lüthy, T. Varga, and H. Bunke. Using hidden markov models as a tool for handwritten text line segmentation. In Proc. 9th Int. Conf. on Document Analysis and Recognition, pages 8–12, 2007. [ bib ]
[122] R. Chavarriaga, P. W. Ferrez, and J. del R. Millán. To err is human: learning from error potentials in brain-computer interfaces. In 1st International Conference on Cognitive Neurodynamics (ICCN 2007), 2007. IDIAP-RR 07-37. [ bib | .ps.gz | .pdf | Abstract ]
[123] A. Drygajlo. Multimodal biometrics for identity documents and smart cards: european challenge. In Proc. 15th European Signal Processing Conf. (EUSIPCO), Poznan, Poland, 2007. (invited paper). [ bib ]
[124] S. Marcel and J. del R. Millán. Person authentication using brainwaves (eeg) and maximum a posteriori model adaptation. IEEE Transactions on Pattern Analysis and Machine Intelligence, Special Issue on Biometrics, 2007. IDIAP-RR 05-81. [ bib | .ps.gz | .pdf | Abstract ]
[125] E. Kokiopoulou and P. Frossard. Image alignment with rotation manifolds built on sparse geometric expansions. In IEEE International Workshop on Multimedia Signal Processing, 2007. [ bib | http ]
[126] M. Sorci, G. Antonini, and J. Ph. Thiran. Fisher's discriminant and relevant component analysis for static facial expression classification. In 15th European Signal Processing Conference (EUSIPCO), Poznan, Poland, 2007. ITS. [ bib | http ]
[127] U. Guz, S. Cuendet, D. Hakkani-Tur, and G. Tur. Co-training using prosodic and lexical information for sentence segmentation. to appear in Proceedings of Interspeech, Antwerp, 2007. [ bib ]
[128] B. Noris, K. Benmachiche, J. Meynet, J. Ph. Thiran, and A. Billard. Analysis of head mounted wireless camera videos for early diagnosis of autism. In International Conference on Recognition Systems, 2007. [ bib | http ]
[129] J. Richiardi, K. Kryszczuk, and A. Drygajlo. Quality measures in unimodal and multimodal biometric verification. In Proc. 15th European Signal Processing Conf. (EUSIPCO), Poznan, Poland, 2007. (invited paper). [ bib ]
[130] M. Liwicki, E. Indermühle, and H. Bunke. On-line handwritten text line detection using dynamic programming. In Proc. 9th Int. Conf. on Document Analysis and Recognition, pages 447–451, 2007. [ bib ]
[131] A. Popescu-Belis and S. Zufferey. Contrasting the automatic identification of two discourse markers in multiparty dialogues. In Proceedings of SIGDIAL 2007, 8th SIGdial Workshop on Discourse and Dialogue, page 10, 2007. [ bib ]
[132] S. Cuendet, D. Hakkani-Tur, E. Shriberg, J. Fung, and B. Favre. Cross-genre feature comparisons for spoken sentence segmentation. International Conference on Semantic Computing (ICSC), Irvine, CA, 2007. [ bib ]
[133] X. Perrin, R. Chavarriaga, R. Siegwart, and J. del R. Millán. Bayesian controller for a novel semi-autonomous navigation concept. In 3rd European Conference on Mobile Robots (ECMR 2007), 2007. IDIAP-RR 07-26. [ bib | .ps.gz | .pdf | Abstract ]
[134] K. Livescu, O. Cetin, M. Hasegawa-Johnson, S. King, C. Bartels, N. Borges, A. Kantor, P. Lal, L. Yung, A. Bezman, S. Dawson-Haggerty, B. Woods, J. Frankel, M. Magimai-Doss, and K. Saenko. Articulatory feature-based methods for acoustic and audio-visual speech recognition: Summary from the 2006 jhu summer workshop. Proc. ICASSP, Honolulu, 2007. [ bib ]
[135] W. Li, J. Dines, and M. Magimai-Doss. Robust overlapping speech recognition based on neural networks. Idiap-RR-55-2007, IDIAP, 2007. [ bib | Abstract ]
[136] M. Liwicki and H. Bunke. Feature selection for on-line handwriting recognition of whiteboard notes. In Proc. 13th Conf. of the Graphonomics Society, pages 101–105, 2007. [ bib ]
[137] R. Bertolami and H. Bunke. Multiple classifier methods for offline handwritten text line recognition. In M. Haindl, J. Kittler, and F. Roli, editors, Multiple Classifier Systems, volume 4472 of Lecture Notes in Computer Science, pages 72–81. Springer, 2007. [ bib ]
[138] A. Humm, J. Hennebert, and R. Ingold. Hidden markov models for spoken signature verification. 2007. [ bib ]
[139] M. Huijbregts and C. Wooters. The blame game: Performance analysis of speaker diarization system components. To appear in Proc. Interspeech, Antwerp, 2007. [ bib ]
[140] F. Cincotti, L. Kauhanen, and F. Aloise. Vibrotactile feedback for brain-computer interface operation. Computational Intelligence and Neuroscience, 2007:Article ID 48937, 2007. doi:10.1155/2007/48937. [ bib ]
[141] P. Bouillon, G. Flores, M. Starlander, N. Chatzichrisafis, M. Santaholma, N. Tsourakis, M. Rayner, and B. A. Hockey. A bidirectional grammar-based medical speech translator. In Proceedings of workshop on Grammar-based approaches to spoken language processing, pages 41–48. ACL 2007, 2007. [ bib ]
[142] K. Ansari-Asl, G. Chanel, and T. Pun. A channel selection method for eeg classification in emotion assessment based on synchronization likelihood. In Eusipco 2007, 15th Eur. Signal Proc. Conf., Poznan, Poland, 2007. [ bib ]
[143] J. Kolar, Y. Liu, and E. Shriberg. Speaker adaptation of language models for automatic dialog act segmentation of meetings. To appear in Proceedings of Interspeech, Antwerp, 2007. [ bib ]
[144] F. Einsele, J. Hennebert, and R. Ingold. Towards identification of very low resolution, anti-aliased characters. In IEEE International Symposium on Signal Processing and its Applications (ISSPA'07), Sharjah, United Arab Emirates, 2007. [ bib ]
[145] M. Neuhaus and H. Bunke. A quadratic programming approach to the graph edit distance problem. In F. Escolano and M. Vento, editors, Graph-Based Representations in Pattern Recognition, volume 4538 of Lecture Notes in Computer Science, pages 92–102. Springer, 2007. [ bib ]
[146] K. Livescu, A. Bezman, N. Borges, L. Yung, O. Cetin, J. Frankel, S. King, M. Magimai-Doss, X. Chi, and L. Lavoie. Manual transcription of conversational speech at the articulatory feature level. Proc. ICASSP, Honolulu, 2007. [ bib ]
[147] K. Kumatani, H. Mayer, T. Gehrig, E. Stoimenov, J. McDonough, and M. Wölfel. Minimum mutual information beamforming for simultaneous active speakers. In IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU), number Idiap-RR-73-2007, pages 71–76, 2007. [ bib | DOI | Abstract ]
[148] K. Kumatani, H. Mayer, T. Gehrig, E. Stoimenov, J. McDonough, and M. Wölfel. Adaptive beamforming with a minimum mutual information criterion. volume 15, pages 2527–2541, 2007. [ bib | DOI | Abstract ]
[149] J. Richiardi and A. Drygajlo. Reliability-based voting schemes using modality-independent features in multi-classifier biometric authentication. In Proc. 7th Int. Workshop on Multiple Classifier Systems, Prague, Czech Republic, 2007. Springer. [ bib | .pdf ]
[150] M. Germann, M. D. Breitenstein, I. K. Park, and H. Pfister. Automatic pose estimation for range images on the gpu. In Sixth International Conference on 3-D Digital Imaging and Modeling (3DIM 2007), pages 81–90. IEEE Computer Society, 2007. [ bib ]
[151] A. Thomas, V. Ferrari, B. Leibe, T. Tuytelaars, and L. van Gool. Depth-from-recognition: inferring metadata by cognitive feedback. In ICCV'07 Workshop on 3D Representations for Recognition, 2007. [ bib ]
[152] A. Vinciarelli and S. Favre. Broadcast news story segmentation using social network analysis and hidden markov models. In ACM International Conference on Multimedia, pages 261–264, 2007. IDIAP-RR 07-30. [ bib | Abstract ]
[153] P. Besson, V. Popovici, J. M. Vesin, J. Ph. Thiran, and M. Kunt. Extraction of audio features specific to speech production for multimodal speaker detection. IEEE Transactions on Multimedia, 2007. [ bib | DOI ]
[154] J. P. Pinto, H. Bourlard, A. Graves, and H. Hermansky. Comparing different word lattice rescoring approaches towards keyword spotting. Idiap-RR-32-2007 32, IDIAP, 2007. Submitted for publication. [ bib | Abstract ]
[155] D. Morrison, S. Marchand-Maillet, and E. Bruno. Hierarchical long-term learning for automatic image. In International Conference on Semantics And digital Media Technologies (SAMT 2007), Genova, IT, 2007. [ bib ]
[156] I. Bogdanova, X. Bresson, J. Ph. Thiran, and P. Vandergheynst. Scale-space analysis and active contours for omnidirectional images. IEEE Transactions on Image Processing, 16(7):1888–1901, 2007. [ bib | DOI ]
[157] P. Müller, G. Zeng, P. Wonka, and L. van Gool. Image-based procedural modeling of facades. In Proceedings of ACM SIGGRAPH 2007 / ACM Transactions on Graphics, volume 26, New York, NY, USA, 2007. ACM Press. [ bib ]
[158] A. Vinciarelli, F. Fernàndez, and S. Favre. Semantic segmentation of radio programs using social network analysis and duration distribution modeling. In IEEE International Conference on Multimedia and Expo (ICME), 2007. IDIAP-RR 06-75. [ bib | .ps.gz | .pdf | Abstract ]
[159] F. Evéquoz and D. Lalanne. Indexing and visualizing digital memories through personal email archive. pages 21–24, 2007. [ bib ]
[160] M. Liwicki, A. Schlapbach, P. Loretan, and H. Bunke. Automatic detection of gender and handedness from on-line handwriting. In Proc. 13th Conf. of the Graphonomics Society, pages 179–183, 2007. [ bib ]
[161] J. P. Pinto, P. R. M., B. Yegnanarayana, and H. Hermansky. Significance of contextual information in phoneme recognition. 2007. IDIAP-RR 07-28. [ bib | .ps.gz | .pdf ]
[162] M. Liwicki and H. Bunke. Combining on-line and off-line systems for handwriting recognition. In Proc. 9th Int. Conf. on Document Analysis and Recognition, pages 372–376, 2007. [ bib ]
[163] K. Kryszczuk and A. Drygajlo. Q-stack: uni- and multimodal classifier stacking with quality measures. In Proc. 7th Int. Workshop on Multiple Classifier Systems, Prague, Czech Republic, 2007. Springer. [ bib ]
[164] H. Bunke and M. Neuhaus. Graph matching – exact and error-tolerant methods and the automatic learning of edit costs. In D. J. Cook and L. B. Holder, editors, Mining Graph Data, pages 17–34. Wiley, 2007. [ bib ]
[165] S. Marcel, P. Abbet, and M. Guillemot. Google portrait. Idiap-Com Idiap-Com-07-2007, IDIAP, 2007. [ bib | Abstract ]
[166] V. Pallotta, V. Seretan, and M. Ailomaa. User requirement analysis for meeting information retrieval based on query elicitation. In Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics (ACL 2007), pages 1008–1015, Prague, Czech Republic, 2007. Association for Computational Linguistics. [ bib | .pdf ]
[167] D. Morrison, S. Marchand-Maillet, and E. Bruno. Automatic image annotation with relevance feedback and latent semantic analysis. In Workshop on Adaptive Multimedia Retrieval (AMR 2007), Paris, FR, 2007. [ bib ]
[168] A. Ess, A. Neubeck, and L. van Gool. Generalised linear pose estimation. In BMVC, 2007. in press. [ bib ]
[169] K. E. Ozden, K. Schindler, and L. van Gool. Simultaneous segmentation and 3d reconstruction of monocular image sequences. In International Conference on Computer Vision (ICCV'07), 2007. [ bib ]
[170] B. Leibe, K. Schindler, and L. van Gool. Coupled detection and trajectory estimation for multi-object tracking. In International Conference on Computer Vision (ICCV'07), 2007. [ bib ]
[171] A. Ess, B. Leibe, and L. van Gool. Depth and appearance for mobile scene analysis. In International Conference on Computer Vision (ICCV'07), 2007. [ bib ]
[172] T. Quack, V. Ferrari, B. Leibe, and L. van Gool. Efficient mining of frequent and distinctive feature configurations. In International Conference on Computer Vision (ICCV'07), 2007. [ bib ]
[173] M. Bray, E. Koller-Meier, and L. van Gool. Smart particle filtering for high-dimensional tracking. Computer Vision and Image Understanding, 2007. [ bib ]
[174] K. Kryszczuk, J. Richiardi, and A. Drygajlo. Reliability estimation for multimodal error prediction and fusion. In Proc. 7th Int. Workshop on Pattern Recognition in Information Systems (PRIS 2007), Funchal, Portugal, 2007. [ bib ]
[175] J. del R. Millán, P. W. Ferrez, F. Galán, E. Lew, and R. Chavarriaga. Non-invasive brain-actuated interaction. In Proceedings of the 2nd International Symposium on Brain, Vision and Artificial Intelligence, volume 4729, Naples, Italy, 2007. [ bib | DOI | Abstract ]
[176] F. Monay. Learning the structure of image collections with latent aspect models. In ., 2007. IDIAP-RR 07-06. [ bib | .pdf | Abstract ]
[177] G. Aradilla and J. Ajmera. Detection and recognition of number sequences within spoken utterances. In 2nd Workshop on Speech in Mobile and Pervasive Environments, 2007. [ bib | Abstract ]
[178] F. Aloise, N. Caporusso, D. Mattia, F. Babiloni, L. Kauhanen, J. del R. Millán, M. Nuttin, M. G. Marciani, and F. Cincotti. Brain-machine interfaces through control of electroencephalographic signals and vibrotactile feedback. In Proceedings of the 12th International Conference on Human-Computer Interaction, volume 125, Beijing, China, 2007. [ bib | Abstract ]
[179] K. Smith. Bayesian methods for visual multi-object tracking with applications to human activity recognition. PhD thesis, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland, 2007. Thèse sciences Ecole polytechnique fédérale de Lausanne EPFL, no 3745 (2007), Faculté des sciences et techniques de l'ingénieur STI, Section de génie électrique et électronique, Institut de génie électrique et électronique IEL (Laboratoire de l'IDIAP LIDIAP). Dir.: Hervé Bourlard, Daniel Gatica-Perez. [ bib | Abstract ]
[180] J. Hennebert, A. Humm, and R. Ingold. Modelling spoken signatures with gaussian mixture model adaptation. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 07), 2007. [ bib ]
[181] A. Drygajlo. Man-machine voice communication, pages 433–461. EPFL Press, 2007. [ bib | DOI ]
[182] O. Vinyals, G. Friedland, and N. Mirghafori. Revisiting a basic function on current cpus: A fast logarithm implementation with adjustable accuracy. ICSI Technical Report number TR-07-002, 2007. [ bib ]
[183] M. Liwicki, A. Graves, H. Bunke, and J. Schmidhuber. A novel approach to on-line handwriting recognition based on bidirectional long short-term memory networks. In Proc. 9th Int. Conf. on Document Analysis and Recognition, pages 367–371, 2007. [ bib ]
[184] F. Orabona, C. Castellini, B. Caputo, J. Luo, and G. Sandini. On-line independent support vector machines for cognitive systems. Idiap-RR Idiap-RR-63-2007, IDIAP, 2007. [ bib | Abstract ]
[185] P. W. Ferrez. Error-related eeg potentials in brain-computer interfaces. PhD thesis, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland, 2007. PhD Thesis #3928 at the École Polytechnique Fédérale de Lausanne. [ bib | Abstract ]
[186] G. Aradilla and H. Bourlard. Posterior-based features and distances in template matching for speech recognition. In 4th Joint Workshop on Multimodal Interaction and Related Machine Learning Algorithms (MLMI), volume 4892, pages 204–214, 2007. IDIAP-RR 07-41. [ bib | DOI | Abstract ]
[187] B. Mesot and D. Barber. A bayesian switching linear dynamical system for scale-invariant robust speech extraction. Technical report, Idiap Research Institute, 2007. [ bib | Abstract ]
[188] R. Villán, S. Voloshynovskiy, O. Koval, F. Deguillaume, and T. Pun. Tamper-proofing of electronic and printed text documents via robust hashing and data-hiding. In Proceedings of SPIE-IS&T Electronic Imaging 2007, Security, Steganography, and Watermarking of Multimedia Contents IX, San Jose, USA, 2007. [ bib | .pdf ]
[189] K. Riesen, M. Neuhaus, and H. Bunke. Graph embedding in vector spaces by means of prototype selection. In F. Escolano and M. Vento, editors, Graph-Based Representations in Pattern Recognition, volume 4538 of Lecture Notes in Computer Science, pages 383–393. Springer, 2007. [ bib ]
[190] A. Stolcke, X. Anguera, K. Boakye, O. Cetin, A. Janin, M. Magimai-Doss, C. Wooters, and J. Zheng. The sri-icsi spring 2007 meeting and lecture recognition system. Lecture Notes in Computer Science, 2007. [ bib ]
[191] A. Stolcke, S. Kajarekar, L. Ferrer, and E. Shriberg. Speaker recognition with session variability normalization based on mllr adaptation transforms. IEEE Transactions on Audio, Speech, and Language Processing, special issue on speaker and language recognition, 2007. [ bib ]
[192] H. Hung, D. Jayagopi, C. Yeo, G. Friedland, S. Ba, J. M. Odobez, K. Ramchandran, N. Mirghafori, and D. Gatica-Perez. Using audio and video features to classify the most dominant person in a group meeting. In Proc. ACM Multimedia, Augsburg, Germany, 2007. [ bib ]
[193] K. Kryszczuk, J. Richiardi, P. Prodanov, and A. Drygajlo. Reliability-based decision fusion in multimodal biometric verification systems. EURASIP Journal of Advances in Signal Processing, 2007. (in press). [ bib ]
[194] M. Rigamonti, D. Lalanne, and R. Ingold. Faericworld: browsing multimedia events through static documents and links. In Proc. of INTERACT 2007, LNCS, page to appear, Rio de Janeiro, Brazil, 2007. Springer-Verlag. [ bib ]
[195] J. Hennebert, R. Loeffel, A. Humm, and R. Ingold. A new forgery scenario based on regaining dynamics of signature. In International Conference on Biometrics (ICB 2007), Seoul, Korea, 2007. Accepted for publication. [ bib ]
[196] J. M. Pardo, X. Anguera, and C. Wooters. Speaker diarization for multiple-distant-microphone meetings using several sources of information. to appear in IEEE Transactions on Computers, 2007. [ bib ]
[197] A. Jaimes, D. Gatica-Perez, N. Sebe, and T. S. Huang. Human-centered computing: toward a human revolution. IEEE Computer, 40(5), 2007. IDIAP-RR 07-57. [ bib | DOI | Abstract ]
[198] F. Galán, J. Palix, R. Chavarriaga, P. W. Ferrez, E. Lew, C. A. Hauert, and J. del R. Millán. Visuo-spatial attention frame recognition for brain-computer interfaces. In Proceedings of the 1st International Conference on Cognitive Neurodynamics, Shanghai, China, 2007. [ bib | Abstract ]
[199] A. Schlapbach and H. Bunke. Fusing asynchronous feature streams for on-line writer identification. In Proc. 9th Int. Conf. on Document Analysis and Recognition, pages 103–107, 2007. [ bib ]
[200] S. R. Mahadeva Prasanna, B. Yegnanarayana, J. P. Pinto, and H. Hermansky. Analysis of confusion matrix to combine evidence for phoneme recognition. Idiap-RR-27-2007 27, IDIAP, 2007. Submitted for publication. [ bib | Abstract ]
[201] T. Quack, V. Ferrari, B. Leibe, and L. van Gool. Efficient mining of frequent and distinctive feature configurations. In accepted for ICCV'07, 2007. [ bib ]
[202] A. Humm, J. Hennebert, and R. Ingold. Modelling combined handwriting and speech modalities. In International Conference on Biometrics (ICB 2007), Seoul, Korea, 2007. Accepted for publication. [ bib ]
[203] J. P. Pinto, A. Lovitt, and H. Hermansky. Exploiting phoneme similarities in hybrid hmm-ann keyword spotting. In Proceedings of Interspeech, 2007. IDIAP-RR 07-11. [ bib | .ps.gz | .pdf | Abstract ]
[204] E. Kokiopoulou and P. Frossard. Accelerating distributed consensus using extrapolation. IEEE Signal Processing Letters, 14(10):665–668, 2007. [ bib ]
[205] A. Schlapbach and H. Bunke. A writer identification and verification system using hmm based recognizers. Pattern Analysis and Applications, 10(1):33–43, 2007. [ bib ]
[206] F. Valente and H. Hermansky. Combination of acoustic classifiers based on dempster-shafer theory of evidence. In IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), 2007. IDIAP-RR 06-61. [ bib | .ps.gz | .pdf | Abstract ]
[207] A. Vinciarelli. Mapping nonverbal communication into social status: automatic recognition of journalists and non-journalists in radio news. IDIAP-RR 33, IDIAP, 2007. Submitted for publication. [ bib | .ps.gz | .pdf | Abstract ]
[208] M. Plauché, O. Cetin, and N. Uhdaykumar. How to build a spoken dialog system with limited (or no) resources. AI in ICT for Development Workshop of the Twentieth Intl. Joint Conf. on AI, Hyderabad, India, 2007. [ bib ]
[209] P. Quelhas, J. M. Odobez, D. Gatica-Perez, and T. Tuytelaars. A thousand words in a scene. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(9):1575–1589, 2007. IDIAP-RR 05-40. [ bib | DOI | Abstract ]
[210] F. Valente, J. Vepa, C. Plahl, C. Gollan, H. Hermansky, and R. Schlüter. Hierarchical neural networks feature extraction for lvcsr system. In Interspeech 2007, 2007. IDIAP-RR 07-08. [ bib | .ps.gz | .pdf | Abstract ]
[211] T. Jaeggli, E. Koller-Meier, and L. van Gool. Learning generative models for monocular body pose estimation. In ACCV, 2007. [ bib ]
[212] K. Schindler, D. Suter, and H. Wang. A model-selection framework for multibody structure-and-motion of image sequences. International Journal of Computer Vision, 79(2):159–177, 2007. [ bib ]
[213] T. Jaeggli, E. Koller-Meier, and L. van Gool. Multi-activity tracking in lle body pose space. In 2nd Workshop on HUMAN MOTION Understanding, Modeling, Capture and Animation, ICCV, 2007. [ bib ]
[214] X. Bresson, S. Esedoglu, P. Vandergheynst, J. Ph. Thiran, and S. Osher. Fast global minimization of the active contour/snake model. Journal of Mathematical Imaging and Vision, 28(2):151–167, 2007. [ bib | DOI | http ]
[215] J. Kludas, E. Bruno, and S. Marchand-Maillet. Information fusion in multimedia information retrieval. In Workshop on Adaptive Multimedia Retrieval (AMR 2007), Paris, FR, 2007. [ bib ]
[216] A. Lovitt. Correcting confusion matrices for phone recognizers. IDIAP-COM 03, IDIAP, 2007. [ bib | .ps.gz | .pdf | Abstract ]
[217] M. Levit, D. Hakkani-Tur, G. Tur, and D. Gillick. Integrating several annotation layers for statistical information distillation. In Workshop on Automatic Speech Recognition and Understanding, 2007. [ bib ]
[218] L. Piccardi, B. Noris, O. Barbey, G. Schiavone, F. Keller, C. Von Hofsten, and A. Billard. Wearcam: a head mounted wireless camera for monitoring gaze attention and for the diagnosis of developmental disorders in young children. In 16th IEEE International Symposium on Robot & Human Interactive Communication, RO-MAN, Special Session: Applications of Robotics and Intelligent System, 2007. [ bib ]
[219] A. Popescu-Belis and P. Estrella. Generating usable formats for metadata and annotations in a large meeting corpus. In ACL 2007, 45th Annual Meeting of the Association for Computational Linguistics, pages 93–96. ACL 2007, 2007. [ bib ]
[220] O. Koval, S. Voloshynovskiy, and T. Pun. Error exponent analysis of person identification based on fusion of dependent/independent modalities. In Proceedings of SPIE-IS&T Electronic Imaging 2007, Security, Steganography, and Watermarking of Multimedia Contents IX, San Jose, USA, 2007. [ bib ]
[221] R. Hérault and Y. Grandvalet. Sparse probabilistic classifiers. In International Conference on Machine Learning (ICML), 2007. IDIAP-RR 07-19. [ bib | .ps.gz | .pdf | Abstract ]
[222] D. Lalanne, F. Evéquoz, H. Chiquet, M. Müller, M. Radgohar, and R. Ingold. Going through digital versus physical augmented gaming. In Tangible Play: Research and Design for Tangible and Tabletop Games. Workshop at the 2007 Intelligent User Interfaces Conference (IUI'07), pages 41–44, Hawaii (USA), 2007. [ bib ]
[223] E. Kokiopoulou and P. Frossard. Accelerating distributed consensus using extrapolation. IEEE Signal Processing Letters, 14(10), 2007. [ bib | DOI | http ]
[224] A. Humm, J. Hennebert, and R. Ingold. Spoken handwriting verification using statistical models. In International Conference on Document Analysis and Recognition (ICDAR 07), Curitiba, Brazil, 2007. Accepted for publication. [ bib ]
[225] M. Georgescul, A. Clark, and S. Armstrong. Exploiting structural meeting-specific features for topic segmentation. In Actes de la 14ème Conférence sur le Traitement Automatique des Langues Naturelles, 2007. [ bib ]
[226] F. Monay and D. Gatica-Perez. Modeling semantic aspects for cross-media image indexing. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29:1802–1817, 2007. IDIAP-RR 05-56. [ bib | DOI | Abstract ]
[227] M. Broschart, C. de Negueruela, J. del R. Millán, and C. Menon. Augmenting astronaut's capabilities through brain-machine interfaces. In Proceedings of the 20th International Joint Conference on Artificial Intelligence, Workshop on Artificial Intelligence for Space Applications, Hyderabad, India, 2007. [ bib | Abstract ]
[228] B. Fasel and L. van Gool. Interactive museum guide: accurate retrieval of object descriptions. In S. Marchand-Maillet, E. Bruno, A. Nürnberger, and M. Detyniecki, editors, Adaptive Multimedia Retrieval: User, Context, and Feedback, pages 179–191. Springer, 2007. [ bib ]
[229] G. Bologna, B. Deville, T. Pun, and M. Vinckenbosch. Identifying major components of pictures by audio encoding of colors. In IWINAC2007, 2nd. Int. Work-conf. on the Interplay between Natural and Artificial Computation, Murcia, Spain, 2007. [ bib ]
[230] S. Chiappa and D. Barber. Bayesian factorial linear gaussian state-space models for biosignal decomposition. IEEE Signal Processing Letters, 2007. IDIAP-RR 05-84. [ bib | .pdf | Abstract ]
[231] T. Kaufmann and B. Pfister. Applying licenser rules to a grammar with continuous constituents. In The Proceedings of the 14th International Conference on Head-Driven Phrase Structure Grammar, 2007. [ bib ]
[232] S. Bengio and J. Mariéthoz. Biometric person authentication is a multiple classifier problem. In 7th International Workshop on Multiple Classifier Systems, MCS, 2007. IDIAP-RR 07-03. [ bib | .ps.gz | .pdf | Abstract ]
[233] R. Grave de Peralta Menendez, S. L. González Andino, P. W. Ferrez, and J. del R. Millán. Non-invasive estimates of local field potentials for brain-computer interfaces. In G. Dornhege, J. del R. Millán, T. Hinterberger, D. McFarland, and K. R. Müller, editors, Towards Brain-Computer Interfacing. The MIT Press, 2007. [ bib | Abstract ]
[234] L. van Gool, G. Zeng, F. van den Borre, and P. Müller. Towards mass-produced building models. In U. Stilla, H. Mayer, F. Rottensteiner, C. Heipke, and S. Hinz, editors, Photogrammetric Image Analysis, pages 209–220. Institute of Photogrammetry and Cartography, Technische Universitaet Muenchen, 2007. [ bib ]
[235] J. Meynet, V. Popovici, and J. Ph. Thiran. Mixtures of boosted classifiers for frontal face detection. Signal, Image and Video Processing, 1(1):29–38, 2007. [ bib | DOI | http ]
[236] D. G. Zacharie and J. P. Pinto. Keyword spotting on word lattices. IDIAP-RR 22, IDIAP, 2007. [ bib | .ps.gz | .pdf ]
[237] L. Uldry, P. W. Ferrez, and J. del R. Millán. Feature selection methods on distributed linear inverse solutions for a non-invasive brain-machine interface. IDIAP-COM 04, IDIAP, 2007. [ bib | .ps.gz | .pdf ]
[238] E. Bruno, J. Kludas, and S. Marchand-Maillet. Combining multimodal preferences for multimedia information retrieval. In Proc. of International Workshop on Multimedia Information Retrieval, Augsburg, Germany, 2007. [ bib ]
[239] J. del R. Millán. Tapping the mind or resonating minds? In P. T. Kidd, editor, European Visions for the Knowledge Age. Cheshire Henbury, 2007. [ bib | Abstract ]
[240] E. Kron, M. Rayner, M. Santaholma, and P. Bouillon. A development environment for building grammar-based speech-enabled applications. In Proceedings of workshop on Grammar-based approaches to spoken language processing, pages 49–52. ACL 2007, 2007. [ bib ]
[241] A. Vinciarelli. Role recognition in broadcast news using social network analysis and duration distribution modeling. IEEE Transactions on Multimedia, 2007. IDIAP-RR 06-35. [ bib | .ps.gz | .pdf | Abstract ]
[242] L. Chen, D. Barber, and J. M. Odobez. Dynamical dirichlet mixture model. IDIAP-RR 02, IDIAP, 2007. [ bib | .ps.gz | .pdf | Abstract ]
[243] J. Zheng, O. Cetin, M. Y. Hwang, X. Lei, A. Stolcke, and N. Morgan. Combining discriminative feature, transform, and model training for large vocabulary speech recognition. Proc. ICASSP, Honolulu, 2007. [ bib ]
[244] M. Gerber, T. Kaufmann, and B. Pfister. Perceptron-based class verification. In Proceedings of NOLISP (ISCA Workshop on non linear speech processing), Paris, 2007. [ bib ]
[245] P. Motlicek, H. Hermansky, S. Ganapathy, and H. Garudadri. Non-uniform speech/audio coding exploiting predictability of temporal evolution of spectral envelopes. In Tenth International Conference on TEXT, SPEECH and DIALOGUE (TSD) [13], pages 350–357. IDIAP-RR 06-30. [ bib | Abstract ]
[246] J. Philips, J. del R. Millán, G. Vanacker, E. Lew, F. Galán, P. W. Ferrez, H. van Brussel, and M. Nuttin. Adaptive shared control of a brain-actuated simulated wheelchair. In Proceedings of the 10th IEEE International Conference on Rehabilitation Robotics, pages 408–414, Noordwijk, The Netherlands, 2007. [ bib | DOI | Abstract ]
[247] G. Heusch and S. Marcel. Face authentication with salient local features and static bayesian network. In IEEE / IAPR Intl. Conf. On Biometrics (ICB), 2007. IDIAP-RR 07-04. [ bib | .ps.gz | .pdf | Abstract ]
[248] O. Cetin, A. Kantor, S. King, C. Bartels, M. Magimai-Doss, J. Frankel, and K. Livescu. An articulatory feature-based tandem approach and factored observation modeling. Proc. ICASSP, Honolulu, 2007. [ bib ]
[249] A. Behera, D. Lalanne, and R. Ingold. Docmir: an automatic document-based indexing system for meeting retrieval. Multimedia Tools and Applications, 37(2), 2007. [ bib ]
[250] M. Gurban, A. Valles, and J. Ph. Thiran. Low-dimensional motion features for audio-visual speech recognition. In 15th European Signal Processing Conference (EUSIPCO), Poznan, Poland, 2007. [ bib | http ]
[251] T. Weise, B. Leibe, and L. van Gool. Fast 3d scanning with automatic motion compensation. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR'07), 2007. [ bib ]
[252] B. Leibe, N. Cornelis, K. Cornelis, and L. van Gool. Dynamic 3d scene analysis from a moving vehicle. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR'07), 2007. [ bib ]
[253] F. Evéquoz and D. Lalanne. Personal information management through interactive visualizations. pages 158–160, 2007. [ bib ]
[254] T. Kaufmann and B. Pfister. An hpsg parser supporting discontinuous licenser rules. In International Conference on HPSG, 2007. (to appear). [ bib ]
[255] A. Jaimes, D. Gatica-Perez, N. Sebe, and T. S. Huang. Guest editors' introduction: Human-centered computing-toward a human revolution. Computer, 40(5):30–34, 2007. [ bib ]
[256] M. Magimai-Doss, D. Hakkani-Tur, O. Cetin, E. Shriberg, J. Fung, and N. Mirghafori. Entropy based classifier combination for sentence segmentation. Proc. ICASSP, Honolulu, 2007. [ bib ]
[257] J. Mariéthoz and S. Bengio. A kernel trick for sequences applied to text-independent speaker verification systems. Pattern Recognition, 40(8), 2007. IDIAP-RR 05-77. [ bib | Abstract ]
[258] S. Ba. Joint head tracking and pose estimation for visual focus of attention recognition. PhD thesis, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland, 2007. Thèse sciences Ecole polytechnique fédérale de Lausanne EPFL, no 3764 (2007), Faculté des sciences et techniques de l'ingénieur STI, Section de génie électrique et électronique, Institut de génie électrique et électronique IEL (Laboratoire de l'IDIAP LIDIAP). Dir.: Hervé Bourlard, Jean-Marc Odobez. [ bib | Abstract ]
[259] A. Graves, M. Liwicki, and H. Bunke. Unconstrained on-line handwriting recognition with recurrent neural networks. In Advances in Neural Information Processing, volume 20 of NIPS, Vancouver, 2007. [ bib ]
[260] I. Laptev, B. Caputo, and T. Lindeberg. Local velocity-adapted motion events for spatio-temporal recognition. Computer Vision and Image Understanding, 108(3):207–229, 2007. [ bib | Abstract ]
[261] G. Chanel, K. Ansari-Asl, and T. Pun. Valence-arousal evaluation using physiological signals in an emotion recall paradigm. In 2007 IEEE SMC, Int. Conf. on Systems, Man and Cybernetics, Smart cooperative systems and cybernetics: advancing knowledge and security for humanity, Montreal, Canada, 2007. [ bib ]
[262] C. Müller and F. Burkhardt. Combining short-term cepstral and long-term pitch features for automatic recognition of speaker age. to appear in Proceedings of Interspeech, Antwerp, 2007. [ bib ]
[263] D. Lalanne, F. Evéquoz, M. Rigamonti, B. Dumas, and R. Ingold. An ego-centric and tangible approach to meeting indexing and browsing. In 4th Joint Workshop on Multimodal Interaction and Related Machine Learning Algorithms (MLMI'07), page to appear, Brno (Czech Republic), 2007. [ bib ]
[264] S. Marcel, Y. Rodriguez, and G. Heusch. On the recent use of local binary patterns for face authentication. International Journal on Image and Video Processing Special Issue on Facial Image Processing, 2007. IDIAP-RR 06-34. [ bib | Abstract ]
[265] D. Dessimoz, J. Richiardi, C. Champod, and A. Drygajlo. Multimodal biometrics for identity documents (mbioid). Forensic Science International, 167:154–159, 2007. [ bib | DOI ]
[266] M. Knox and N. Mirghafori. Automatic laughter detection using neural networks. to appear in Proceedings of Interspeech, Antwerp, 2007. [ bib ]
[267] S. Cuendet, D. Hakkani-Tur, and E. Shriberg. Automatic labeling inconsistencies detection and correction for sentence unit segmentation in conversational speech. to appear in Proceedings of MLMI, Brno, Czech Republic, 2007. [ bib ]
[268] A. Popescu-Belis. Evaluation of nlg: some analogies and differences with mt and reference resolution. In MT Summit XI Workshop on Using Corpora for NLG and MT (UCNLG MT), pages 66–68, Copenhagen, Denmark, 2007. [ bib ]
[269] M. Neuhaus and H. Bunke. Bridging the gap between graph edit distance and kernel machines, volume 68 of Machine Perception and Artificial Intelligence. World Scientific, 2007. [ bib ]
[270] J. M. Odobez and S. Ba. A cognitive and unsupervised map adaptation approach to the recognition of the focus of attention from head pose. In International Conference on Multi-Media & Expo (ICME07), 2007. IDIAP-RR 07-20. [ bib | .ps.gz | .pdf | Abstract ]
[271] M. Starlander. Using a wizard of oz as a baseline to determine which system architecture is the best for a spoken language translation system. In Proceedings of Nodalida 2007, 16th Nordic Conference of Computational Linguistics, pages 161–164, 2007. [ bib ]
[272] M. Y. Hwang, G. Peng, W. Wang, A. Faria, A. Heidel, and M. Ostendorf. Building a highly accurate mandarin speech recognizer. IEEE workshop on Automatic Speech Recognition and Understanding (ASRU 07), Kyoto, 2007. [ bib ]
[273] D. Grangier and S. Bengio. Learning the inter-frame distance for discriminative template-based keyword detection. In International Conference on Speech Communication and Technology (INTERSPEECH), 2007. [ bib | .ps.gz | .pdf | Abstract ]
[274] D. Lalanne and E. van den Hoven. Supporting human memory with interactive systems. pages 215–216, 2007. [ bib ]
[275] F. Cincotti, D. Mattia, F. Aloise, S. Bufalari, L. Astolfi, F. De Vico Fallani, A. Tocci, L. Bianchi, M. G. Marciani, S. Gao, J. del R. Millán, and F. Babiloni. High-resolution eeg techniques for brain-computer interface applications. Journal of Neuroscience Methods, 167:31–42, 2007. [ bib | Abstract ]
[276] C. Wooters and M. Huijbregts. The icsi rt07s speaker diarization system. to appear in Lecture Notes in Computer Science, 2007. [ bib ]
[277] E. Bruno, J. Kludas, and S. Marchand-Maillet. Combining multimodal preferences for multimedia information retrieval. In ACM SIGMM - International Workshop on Multimedia Information Retrieval, Augsburg, DE, 2007. [ bib ]
[278] R. Chavarriaga, P. W. Ferrez, and J. del R. Millán. To err is human: Learning from error potentials in brain-computer interfaces. In 1st International Conference on Cognitive Neurodynamics (ICCN 2007), Shanghai, China, 2007. [ bib ]
[279] L. Stoll, J. Frankel, and N. Mirghafori. Speaker recognition via nonlinear discriminant features. Proceedings of NOLISP, Paris, France, 2007. [ bib ]
[280] J. del R. Millán, P. W. Ferrez, and A. Buttfield. The idiap brain-computer interface: an asynchronous multi-class approach. In G. Dornhege, J. del R. Millán, T. Hinterberger, D. McFarland, and K. R. Müller, editors, Towards Brain-Computer Interfacing. The MIT Press, 2007. [ bib | Abstract ]
[281] X. Anguera, C. Wooters, and J. Hernando. Acoustic beamforming for speaker diarization of meetings. to appear in IEEE Transactions on Audio, Speech and Language Processing, 2007. [ bib ]
[282] F. Frapolli, B. Hirsbrunner, and D. Lalanne. Dynamic rules: towards interactive games intelligence. In Tangible Play: Research and Design for Tangible and Tabletop Games. Workshop at the 2007 Intelligent User Interfaces Conference (IUI'07), pages 29–32, Hawaii (USA), 2007. [ bib ]
[283] G. Dornhege, J. del R. Millán, T. Hinterberger, D. McFarland, and K. R. Müller. Towards brain-computer interfacing. The MIT Press, 2007. [ bib ]
[284] K. Riesen, M. Neuhaus, and H. Bunke. Bipartite graph matching for computing the edit distance of graphs. In F. Escolano and M. Vento, editors, Graph-Based Representations in Pattern Recognition, volume 4538 of Lecture Notes in Computer Science, pages 1–12. Springer, 2007. [ bib ]
[285] T. Drugman, M. Gurban, and J. Ph. Thiran. Relevant feature selection for audio-visual speech recognition. In 9th International Workshop on Multimedia Signal Processing (MMSP), 2007. [ bib | http | Abstract ]
[286] J. Keshet. Theoretical foundations for large-margin kernel-based continuous speech recognition. Idiap-RR Idiap-RR-44-2007, IDIAP, 2007. [ bib ]
[287] S. Cuendet, E. Shriberg, B. Favre, J. Fung, and D. Hakkani-Tur. An analysis of sentence segmentation features for broadcast news, broadcast conversations, and meetings. In SIGIR Workshop on Searching Conversational Spontaneous Speech, 2007. [ bib ]
[288] G. Vanacker, J. del R. Millán, E. Lew, P. W. Ferrez, F. Galán, J. Philips, H. van Brussel, and M. Nuttin. Context-based filtering for assisted brain-actuated wheelchair driving. Computational Intelligence and Neuroscience, 2007:3, 2007. [ bib | Abstract ]
[289] P. Bouillon, N. Chatzichrisafis, S. Halimi, B. A. Hockey, H. Isahara, K. Kanzaki, Y. Nakao, B. Novellas Vall, M. Rayner, M. Santaholma, and M. Starlander. Medslt: a multi-lingual grammar-based medical speech translator. In Proceedings of First International Workshop on Intercultural Collaboration. IWIC2007, 2007. [ bib ]
[290] E. Kokiopoulou and P. Frossard. Dimensionality reduction with adaptive approximation. In IEEE Int. Conf. on Multimedia & Expo (ICME), 2007. [ bib | http ]
[291] A. Popescu-Belis. Le rôle des métriques d'évaluation dans le processus de recherche en tal. TAL (Traitement Automatique des Langues), 47(2), 2007. [ bib ]
[292] J. Meynet and J. Ph. Thiran. Information theoretic combination of classifiers with application to adaboost. In 7th international Workshop on Multiple Classifier Systems (MCS), Prague, 2007. ITS. [ bib ]
[293] F. Valente, H. Bourlard, and V. Deepu. Agglomerative information bottleneck for speaker diarization of meetings data. IDIAP-RR 31, IDIAP, 2007. Submitted for publication. [ bib | .ps.gz | .pdf | Abstract ]
[294] U. Hoffmann, J. M. Vesin, and T. Ebrahimi. Recent advances in brain-computer interfaces. In IEEE International Workshop on Multimedia Signal Processing, 2007. Invited Paper. [ bib | http | Abstract ]
[295] J. Meynet, V. Popovici, and J. Ph. Thiran. Face detection with boosted gaussian features. Pattern Recognition, 40(8):2283–2291, 2007. [ bib | DOI | Abstract ]
[296] G. Bologna, B. Deville, T. Pun, and M. Vinckenbosch. Transforming 3d coloured pixels into musical instrument notes for vision substitution applications. Eurasip J. of Image and Video Processing, Special Issue: Image and Video Processing for Disability, accepted for publication, 2007. (to appear). [ bib ]
[297] Y. Huang, G. Friedland, C. Müller, and N. Mirghafori. Speeding up speaker diarization by using prosodic features. Technical Report TR-07-004, International Computer Science Institute, Berkeley, California, 2007. [ bib ]
[298] Y. Huang, O. Vinyals, G. Friedland, C. Müller, N. Mirghafori, and C. Wooters. A fast-match approach for robust, faster than real-time speaker diarization. IEEE workshop on Automatic Speech Recognition and Understanding (ASRU 07), Kyoto, 2007. [ bib ]
[299] Y. Huang. Robust and rapid speaker diarization. Master Thesis, University of California, Berkeley, 2007. [ bib ]
[300] J. del R. Millán, A. Buttfield, C. Vidaurre, M. Krauledat, A. Schlögl, P. Shenoy, B. Blankertz, R. P. N. Rao, R. Cabeza, G. Pfurtscheller, and K. R. Müller. Adaptation in brain-computer interfaces. In G. Dornhege, J. del R. Millán, T. Hinterberger, D. McFarland, and K. R. Müller, editors, Towards Brain-Computer Interfacing. The MIT Press, 2007. [ bib | Abstract ]
[301] I. McCowan, H. K. Maganti, and D. Gatica-Perez. Speech enhancement and recognition in meetings with an audio-visual sensor array. IEEE Trans. on Audio, Speech, and Language Processing, 15(8):2257–2269, 2007. [ bib ]
[302] G. Aradilla, J. Vepa, and H. Bourlard. An acoustic model based on kullback-leibler divergence for posterior features. In IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), 2007. IDIAP-RR 06-60. [ bib | .ps.gz | .pdf | Abstract ]
[303] A. Pronobis and B. Caputo. Confidence-based cue integration for visual place recognition. IDIAP-RR 17, IDIAP, 2007. [ bib | .ps.gz | .pdf | Abstract ]
[304] B. Mesot and D. Barber. A gaussian sum smoother for inference in switching linear dynamical systems. Technical report, Idiap Research Institute, 2007. [ bib ]
[305] J. Yao and J. M. Odobez. Multi-layer background subtraction based on color and texture. In CVPR 2007 Workshop on Visual Surveillance (VS2007), volume 17-22, pages 1–8, 2007. [ bib | DOI | Abstract ]
[306] P. Motlicek, H. Hermansky, S. Ganapathy, and H. Garudadri. Frequency domain linear prediction for qmf sub-bands and applications to audio coding. In 4th Joint Workshop on Multimodal Interaction and Related Machine Learning Algorithms (MLMI) [95], pages 248–258. IDIAP-RR 07-16. [ bib | Abstract ]
[307] A. Rakotomamonjy, F. Bach, S. Canu, and Y. Grandvalet. More efficiency in multiple kernel learning. In International Conference on Machine Learning (ICML), 2007. IDIAP-RR 07-18. [ bib | .ps.gz | .pdf | Abstract ]
[308] A. Lovitt. Truncation confusion patterns in onset consonants. In Interspeech 2007, 2007. IDIAP-RR 07-05. [ bib | .ps.gz | .pdf | Abstract ]
[309] H. Lei and N. Mirghafori. Word-conditioned hmm supervectors for speaker recognition. to appear in Proceedings of Interspeech, Antwerp, 2007. [ bib ]
[310] H. Lei and N. Mirghafori. Word-conditioned phone n-grams for speaker recognition. Proc. ICASSP, Honolulu, 2007. [ bib ]
[311] M. Gerber, R. Beutler, and B. Pfister. Quasi text-independent speaker verification based on pattern matching. In Proceedings of Interspeech. ISCA, 2007. [ bib ]
[312] R. Bertolami, S. Uchida, M. Zimmermann, and H. Bunke. Non-uniform slant correction for handwritten text line recognition. In Proc. 9th Int. Conf. on Document Analysis and Recognition, pages 18–22, 2007. [ bib ]
[313] D. Hakkani-Tur and G. Tur. Statistical sentence extraction for information distillation. Proc. ICASSP, Honolulu, 2007. [ bib ]
[314] D. Lalanne, E. Bertini, P. Hertzog, and P. Bados. Visual analysis of corporate network intelligence: abstracting and reasoning on yesterdays for acting today. 2007. [ bib ]
[315] K. Kryszczuk and A. Drygajlo. Improving classification with class-independent quality measures: q-stack in face verification. In Proc. 2nd Int. Conference in Biometrics (ICB 2007), Seoul, South Korea, 2007. [ bib ]
[316] J. Hennebert. Please repeat: my voice is my password. from the basics to real-life implementations of speaker verification technologies. Invited lecture at the Information Security Summit (IS2 2007), Prague, 2007. [ bib ]
[317] S. Marchand-Maillet, E. Bruno, A. Nürnberger, and M. Detyniecki. Adaptive multimedia retrieval: user, context and feedback. Springer, 2007. [ bib ]
[318] F. Galán, M. Nuttin, E. Lew, P. W. Ferrez, G. Vanacker, J. Philips, H. van Brussel, and J. del R. Millán. An asynchronous and non-invasive brain-actuated wheelchair. In Proceedings of the 13th International Symposium on Robotics Research, volume 128, Hiroshima, Japan, 2007. [ bib | Abstract ]
[319] A. Lovitt, J. P. Pinto, and H. Hermansky. On confusions in a phoneme recognizer. 2007. IDIAP-RR 07-10. [ bib | .ps.gz | .pdf | Abstract ]
[320] H. Bay, A. Ess, T. Tuytelaars, and L. van Gool. Speeded-up robust features (surf). Computer Vision and Image Understanding (CVIU), 2007. [ bib ]
[321] O. Koval, S. Voloshynovskiy, and T. Pun. Analysis of multimodal binary detection systems based on dependent/independent modalities. In Proceedings of the IEEE 2007 International Workshop on Multimedia Signal Processing, Chania, Crete, Greece, 2007. [ bib ]
[322] R. Rytsar and T. Pun. Computational aspects of the eeg forward problem solution for real head model using finite element. In 29th Annual Int. Conf. IEEE Engineering in Medicine and Biology Society, Lyon, France, 2007. [ bib ]
[323] E. Bertini, P. Hertzog, and D. Lalanne. Spiralview: a visual tool to improve monitoring and understanding of security data in corporate. In IEEE Symposium on Visual Analytics Science and Technology 2007 (VAST'07), to appear, Sacramento, CA (USA), 2007. [ bib ]
[324] H. Romsdorfer and B. Pfister. Text analysis and language identification for polyglot text-to-speech synthesis. Speech Communication (Elsevier), 2007. (to appear). [ bib ]
[325] X. Anguera, C. Wooters, J. M. Pardo, and J. Hernando. Automatic weighting for the combination of tdoa and acoustic features in speaker diarization for meetings. Proc. ICASSP, Honolulu, 2007. [ bib ]
[326] X. Anguera, T. Shinozaki, C. Wooters, and J. Hernando. Model complexity selection and cross-validation em training for robust speaker diarization. Proc. ICASSP, Honolulu, 2007. [ bib ]
[327] Y. Liu and E. Shriberg. Comparing evaluation metrics for sentence boundary detection. Proc. ICASSP, Honolulu, 2007. [ bib ]
[328] M. Liwicki and H. Bunke. Handwriting recognition of whiteboard notes – studying the influence of training set size and type. Int. Journal of Pattern Recognition and Art. Intelligence, 21(1):83–98, 2007. [ bib ]
[329] S. Ba and J. M. Odobez. Probabilistic head pose tracking evaluation in single and multiple camera setups. In Classification of Events, Activities and Relationship Evaluation and Workshop, 2007. IDIAP-RR 07-21. [ bib | .ps.gz | .pdf | Abstract ]
[330] A. Stolcke, S. Kajarekar, L. Ferrer, and E. Shriberg. Speaker recognition with session variability normalization based on mllr adaptation transforms. IEEE Transactions on Audio, Speech, and Language Processing, 15:1987–1998, 2007. [ bib ]
[331] F. Galán, P. W. Ferrez, F. Oliva, J. Guàrdia, and J. del R. Millán. Feature extraction for multi-class bci using canonical variates analysis. IDIAP-RR 23, IDIAP, 2007. Submitted for publication. [ bib | .ps.gz | .pdf | Abstract ]
[332] G. Lathoud and J. M. Odobez. Short-term spatio-temporal clustering applied to multiple moving speakers. IEEE Transactions on Audio, Speech and Language Processing, 2007. [ bib | Abstract ]
[333] A. Humm, J. Hennebert, and R. Ingold. Database and evaluation protocols for user authentication using combined handwriting and speech modalities. Technical report, Department of Informatics, University of Fribourg, Switzerland, 2007. [ bib ]
[334] S. Ba and J. M. Odobez. Multi-person visual focus of attention from head pose and meeting contextual cues. Idiap-RR Idiap-RR-47-2008, Idiap, 2008. IDIAP-RR 08-47. [ bib | .pdf | Abstract ]
[335] B. Favre, R. Grishman, D. Hillard, H. Ji, D. Hakkani-Tur, and M. Ostendorf. Punctuating speech for information extraction. IEEE ICASSP, Las Vegas, NV, 2008. [ bib ]
[336] H. Hung, Y. Huang, G. Friedland, and D. Gatica-Perez. Estimating the dominant person in multi-party conversations using speaker diarization strategies. IEEE ICASSP, Las Vegas, NV, 2008. [ bib ]
[337] M. Liwicki and H. Bunke. Recognition of whiteboard notes – online, offline and combination. World Scientific, 2008. [ bib ]
[338] P. W. Ferrez and J. del R. Millán. Eeg-based brain-computer interaction: improved accuracy by automatic single-trial error detection. In Advances in Neural Information Processing Systems 20, pages 441–448, 2008. [ bib | Abstract ]
[339] S. H. K. Parthasarathi and H. Hermansky. A data-driven approach to speech/non-speech detection. Idiap-RR Idiap-RR-23-2008, IDIAP, 2008. [ bib | Abstract ]
[340] S. Ba and J. M. Odobez. Recognizing visual focus of attention from head pose in natural meetings. accepted for publication in IEEE Trans. on System, Man and Cybernetics: Part B, Man, 2008. [ bib ]
[341] E. Kokiopoulou and P. Frossard. Minimum distance between pattern transformation manifolds: algorithm and applications. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2008. [ bib ]
[342] H. Bunke, P. Dickinson, M. Neuhaus, and M. Stettler. Matching of hypergraphs – algorithms, applications, and experiments. In H. Bunke, A. Kandel, and M. Last, editors, Applied Pattern Recognition, pages 131–154. Springer, 2008. [ bib ]
[343] A. Singla and D. Hakkani-Tur. Cross-lingual sentence extraction for information distillation. to appear in Proceedings of Interspeech 2008, Brisbane, Australia, 2008. [ bib ]
[344] S. H. K. Parthasarathi, P. Motlicek, and H. Hermansky. Exploiting contextual information for speech/non-speech detection. In Text, Speech and Dialogue, volume 5246 of Series of Lecture Notes In Artificial Intelligence (LNAI), pages 451–459. Springer-Verlag Berlin, Heidelberg, 2008. [ bib | .pdf | Abstract ]
[345] E. Kokiopoulou, S. Pirillos, and P. Frossard. Graph-based classification for multiple observations of transformed patterns. IEEE Int. Conf. Pattern Recognition (ICPR), 2008. [ bib ]
[346] H. Hung and D. Gatica-Perez. Identifying dominant people in meetings from audio-visual sensors. In Proc. IEEE Int. Conf. on Automatic Face and Gesture Recognition (FG), Special Session on Multi-Sensor HCI for Smart Environments, Amsterdam, 2008. [ bib ]
[347] A. Schlapbach, H. Bunke, and F. Wettstein. Estimating the readability of handwritten text – a support vector regression based approach. In Proc. 19th Int. Conf. on Pattern Recognition. IEEE, 2008. [ bib ]
[348] A. Humm, J. Hennebert, and R. Ingold. Spoken signature for user authentication. SPIE Journal of Electronic Imaging, 17, 2008. [ bib ]
[349] K. Kryszczuk and A. Drygajlo. On quality of quality measures for classification. pages 19–28, Heidelberg, 2008. Springer. [ bib ]
[350] W. Li, K. Kumatani, J. Dines, M. Magimai-Doss, and H. Bourlard. A neural network based regression approach for recognizing simultaneous speech. In Joint Workshop on Machine Learning and Multimodal Interaction, 2008. [ bib ]
[351] R. Chavarriaga, F. Galán, and J. del R. Millán. Asynchronous detection and classification of oscillatory brain activity. In 16th European Signal Processing Conference (EUSIPCO 2008), 2008. IDIAP-RR 08-36. [ bib | Abstract ]
[352] N. Scaringella. Timbre and rhythmic trap-tandem features for music information retrieval. In Int. Conf. on Music Information Retrieval (ISMIR), 2008. [ bib | .pdf | Abstract ]
[353] G. S. V. S. Sivaram and H. Hermansky. Introducing temporal asymmetries in feature extraction for automatic speech recognition. In Interspeech 2008, 2008. IDIAP-RR 08-25. [ bib | Abstract ]
[354] G. Aradilla, H. Bourlard, and M. Magimai-Doss. Posterior features applied to speech recognition tasks with limited training data. Idiap-RR Idiap-RR-15-2008, IDIAP, 2008. [ bib | Abstract ]
[355] F. Galán, M. Nuttin, D. Vanhooydonck, E. Lew, P. W. Ferrez, J. Philips, and J. del R. Millán. Continuous brain-actuated control of an intelligent wheelchair by human eeg. In 4th International Brain-Computer Interface Workshop & Training Course, 2008. IDIAP-RR 08-53. [ bib | Abstract ]
[356] S. Y. Zhao and N. Morgan. Multi-stream spectro-temporal features for robust speech recognition. In 9th International Conference of the ISCA (Interspeech 2008), Brisbane, Australia, pages 898–901, 2008. [ bib ]
[357] S. Ganapathy, P. Motlicek, H. Hermansky, and H. Garudadri. Temporal masking for bit-rate reduction in audio codec based on frequency domain linear prediction. In IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), pages 4781–4784, 2008. IDIAP-RR 07-48. [ bib | DOI | Abstract ]
[358] A. Faria and N. Morgan. When a mismatch can be good: large vocabulary speech recognition trained with idealized tandem features. Proceedings of the ACM Symposium on Applied Computing, Fortaleza, Brazil, 2008. [ bib ]
[359] A. Faria and N. Morgan. Corrected tandem features for acoustic model training. accepted for IEEE ICASSP, Las Vegas, NV, 2008. [ bib ]
[360] A. Vinciarelli, M. Pantic, H. Bourlard, and A. Pentland. Social signal processing: state-of-the-art and future perspectives of an emerging domain. In Proceedings of the ACM International Conference on Multimedia, 2008. [ bib | Abstract ]
[361] F. Fleuret, J. Berclaz, R. Lengagne, and P. Fua. Multi-camera people tracking with a probabilistic occupancy map. IEEE Transactions on Pattern Analysis and Machine Intelligence, 30(2):267–282, 2008. [ bib | Abstract ]
[362] G. S. V. S. Sivaram and H. Hermansky. Emulating temporal receptive fields of auditory mid-brain neurons for automatic speech recognition. In Proc. 16th European Signal Processing Conference (EUSIPCO), 2008. IDIAP-RR 08-24. [ bib | Abstract ]
[363] R. Bertolami and H. Bunke. Integration of n-gram language models in multiple classifier systems for offline handwritten text line recognition. Int. Journal of Pattern Recognition and Art. Intelligence, 22(7):1301–1321, 2008. [ bib ]
[364] U. Hoffmann, A. Yazdani, J. M. Vesin, and T. Ebrahimi. Bayesian feature selection applied in a p300 brain-computer interface. In 16th European Signal Processing Conference, 2008. [ bib | http | Abstract ]
[365] F. Galán, M. Nuttin, E. Lew, P. W. Ferrez, G. Vanacker, J. Philips, and J. del R. Millán. A brain-actuated wheelchair: asynchronous and non-invasive brain-computer interfaces for continuous control of robots. Clinical Neurophysiology, (119):2159–2169, 2008. [ bib | Abstract ]
[366] M. Rigamonti. A framework for structuring multimedia archives and for browsing efficiently through multimodal links. PhD thesis, University of Fribourg, Switzerland, 2008. [ bib ]
[367] F. Dufaux and T. Ebrahimi. H.264/avc video scrambling for privacy protection. In IEEE International Conference on Image Processing (ICIP2008), 2008. [ bib | Abstract ]
[368] T. Tommasi, F. Orabona, and B. Caputo. Discriminative cue integration for medical image annotation. Pattern Recognition Letters, 2008. Special Issue on Automatic Annotation of Medical Images (ImageCLEF 2007), in press. [ bib | .pdf | Abstract ]
[369] A. Schlapbach, M. Liwicki, and H. Bunke. A writer identification system for on-line whiteboard data. Pattern Recognition, 41:2381–2397, 2008. [ bib ]
[370] M. Rayner, N. Tsourakis, M. Georgescul, and P. Bouillon. Building mobile spoken dialogue applications using regulus. In European Language Resources Association (ELRA), editor, Proceedings of the Sixth International Language Resources and Evaluation (LREC'08), Marrakech, Morocco, 2008. [ bib | Abstract ]
[371] I. Bogdanova, A. Bur, and H. Hügli. Visual attention on the sphere [in press]. IEEE Transactions on Image Processing, 2008. [ bib ]
[372] S. Ba and J. M. Odobez. Multi-party focus of attention recognition in meetings from head pose and multimodal contextual cues. In IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), Las Vegas, 2008. [ bib | Abstract ]
[373] A. Nijholt, D. Tan, B. Allison, J. del R. Millán, M. Moore, and B. Graimann. Brain-computer interfaces for hci and games. In Proceedings of the 26th Annual CHI Conference on Human Factors in Computing Systems, Extended Abstracts, Florence, Italy, 2008. [ bib | Abstract ]
[374] K. Riedhammer, D. Gillick, B. Favre, and D. Hakkani-Tur. Packing the meeting summarization knapsack. to appear in Proceedings of Interspeech 2008, Brisbane, Australia, 2008. [ bib ]
[375] D. Grangier and S. Bengio. A discriminative kernel-based model to rank images from text queries. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2008. [ bib | Abstract ]
[376] M. Sorci, G. Antonini, B. Cerretani, J. Cruz Mota, T. Rubin, M. Bierlaire, and J. Ph. Thiran. Modelling human perception of static facial expressions. In Face and Gesture Recognition 2008, 2008. [ bib | http | Abstract ]
[377] J. Berclaz, F. Fleuret, and P. Fua. Principled detection-by-classification from multiple views. In Proceedings of the International Conference on Computer Vision Theory and Applications, volume 2, pages 375–382, 2008. [ bib | Abstract ]
[378] A. Thomas, S. Ganapathy, and H. Hermansky. Hilbert envelope based spectro-temporal features for phoneme recognition in telephone speech. In Interspeech 2008, 2008. IDIAP-RR 08-18. [ bib | Abstract ]
[379] M. Pronobis and M. Magimai-Doss. Integrating audio and vision for robust automatic gender recognition. Idiap-RR Idiap-RR-73-2008, Idiap, 2008. [ bib | .pdf | Abstract ]
[380] M. Rigamonti. A framework for structuring multimedia archives and for browsing efficiently through multimodal links. PhD thesis, University of Fribourg, Switzerland, 2008. [ bib ]
[381] O. Koval, S. Voloshynovskiy, F. Beekhof, and T. Pun. Security analysis of robust perceptual hashing. In E. J. Delp III, P. W. Wong, J. Dittmann, and N. D. Memon, editors, Steganography, and Watermarking of Multimedia Contents X, volume 6819 of Proceedings of SPIE, (SPIE, Bellingham, WA 2008) 681906, 2008. [ bib ]
[382] M. Liwicki and H. Bunke. Combining on-line and off-line blstm networks for handwritten text line recognition. In Proc. 11th Int. Conf. on Frontiers in Handwriting Recognition, pages 31–36, 2008. [ bib ]
[383] B. Dumas, D. Lalanne, and R. Ingold. Demonstration : hephaistk, une boîte à outils pour le prototypage d'interfaces multimodales. In Proceedings of 20e Conférence sur l'Interaction Homme-Machine (IHM 08), pages 215–216, Metz (France), 2008. [ bib ]
[384] G. Bologna, B. Deville, M. Vinckenbosch, and T. Pun. Pairing colored socks and following a red serpentine with sounds of musical instruments. In ICAD 08, International Conference on Auditory Displays, Paris, France, June 24–27, 2008. [ bib ]
[385] C. Wooters and M. Huijbregts. The icsi rt07s speaker diarization system. In Multimodal Technologies for Perception of Humans. Lecture Notes in Computer Science, 2008. [ bib ]
[386] J. F. Paiement, S. Bengio, and D. Eck. Probabilistic models for melodic prediction. Idiap-RR Idiap-RR-50-2008, IDIAP, 2008. Submitted for publication. [ bib | .ps.gz | .pdf | Abstract ]
[387] J. F. Paiement, Y. Grandvalet, and S. Bengio. Predictive models for music. Idiap-RR Idiap-RR-51-2008, IDIAP, 2008. Submitted for publication. [ bib | .ps.gz | .pdf | Abstract ]
[388] J. Berclaz, F. Fleuret, and P. Fua. Multi-camera tracking and atypical motion detection with behavioral maps. In Proceedings of the European Conference on Computer Vision (ECCV), pages 112–125, 2008. [ bib ]
[389] T. Varga and H. Bunke. Perturbation models for generating synthetic training data in handwriting recognition. In S. Marinai and H. Fujisawa, editors, Machine Learning in Document Analysis and Recognition, pages 333–360. Springer, 2008. [ bib ]
[390] J. Yao and J. M. Odobez. Fast human detection from videos using covariance features. In European Conference on Computer Vision, workshop on Visual Surveillance (ECCV-VS), 2008. [ bib | .pdf | Abstract ]
[391] D. Vergyri, A. Mandal, W. Wang, A. Stolcke, J. Zheng, M. Graciarena, D. Rybach, C. Gollan, R. Schlater, K. Kirchoff, A. Faria, and N. Morgan. Development of the sri/nightingale arabic asr system. to appear in Proceedings of Interspeech 2008, Brisbane, Australia, 2008. [ bib ]
[392] J. Berclaz, F. Fleuret, and P. Fua. Multi-camera tracking and atypical motion detection with behavioral maps. In The 10th European Conference on Computer Vision (ECCV 2008), 2008. [ bib | Abstract ]
[393] G. Aradilla. Acoustic models for posterior features in speech recognition. PhD thesis, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland, 2008. PhD Thesis no 4164. [ bib | Abstract ]
[394] A. Vinciarelli, M. Pantic, H. Bourlard, and A. Pentland. Social signals, their function, and automatic analysis: a survey. In Proceedings of International Conference on Multimodal Interfaces (to appear), 2008. [ bib | Abstract ]
[395] J. F. Paiement, Y. Grandvalet, S. Bengio, and D. Eck. A distance model for rhythms. In 25th International Conference on Machine Learning (ICML), 2008. IDIAP-RR 08-33. [ bib | .ps.gz | .pdf | Abstract ]
[396] D. Grangier. Machine Learning for Information Retrieval. PhD thesis, École Polytechnique Fédérale de Lausanne, 2008. Thèse Ecole polytechnique fédérale de Lausanne EPFL, no 4088 (2008), Faculté des sciences et techniques de l'ingénieur STI, Section de génie électrique et électronique, Institut de génie électrique et électronique IEL (Laboratoire de l'IDIAP LIDIAP). Dir.: Hervé Bourlard, Sami Bengio. [ bib | .pdf | Abstract ]
[397] D. Jayagopi, H. Hung, C. Yeo, and D. Gatica-Perez. Predicting the dominant clique in meetings through fusion of nonverbal cues. In ACM MM 2008, 2008. IDIAP-RR 08-08. [ bib | Abstract ]
[398] B. Leibe, A. Leonardis, and B. Schiele. Robust object detection with interleaved categorization and segmentation. International Journal of Computer Vision, 77(1-3):259–289, 2008. [ bib ]
[399] J. Kludas, S. Marchand-Maillet, and E. Bruno. Exploiting document feature interactions for efficient information fusion in high dimensional spaces. In Proceedings of the First International Workshops on Image Processing Theory, Tools and Applications (IPTA'2008), Sousse, Tunisia, 2008. (invited). [ bib | .pdf ]
[400] A. Rakotomamonjy, F. Bach, S. Canu, and Y. Grandvalet. Simplemkl. Journal of Machine Learning Research, 9:2491–2521, 2008. [ bib | .pdf | Abstract ]
[401] M. Soleymani, G. Chanel, J. Kierkels, and T. Pun. Affective ranking of movie scenes using physiological signals and content analysis. In 2nd ACM Workshop on the Many Faces of Multimedia Semantics, ACM MM08, Vancouver, Canada, 2008. [ bib ]
[402] A. Schlapbach. Writer identification and verification, volume 311. IOS Press, 2008. [ bib ]
[403] S. Ganapathy, S. Thomas, and H. Hermansky. Modulation frequency features for phoneme recognition in noisy speech. Journal of Acoustical Society of America - Express Letters, 2008. [ bib | .pdf | Abstract ]
[404] M. Soleymani, G. Chanel, J. Kierkels, and T. Pun. Affective characterization of movie scenes based on multimedia content analysis and user's physiological emotional responses. In IEEE International Symposium on Multimedia, Berkeley, US, 2008. [ bib ]
[405] H. Hung, Y. Huang, C. Yeo, and D. Gatica-Perez. Associating audio-visual activity cues in a dominance estimation framework. In CVPR Workshop on Human Communicative Behavior, Anchorage, 2008. [ bib ]
[406] K. Kryszczuk and A. Drygajlo. On quality of quality measures for classification. In Biometrics and Identity Management, Lecture Notes in Computer Science 5372, pages 19–28, Heidelberg, 2008. [ bib ]
[407] K. Kryszczuk and A. Drygajlo. What do quality measures predict in biometrics. In 16th European Signal Processing Conference, Lausanne, Switzerland, 2008. [ bib ]
[408] M. Soleymani, G. Chanel, J. Kierkels, and T. Pun. Valence-arousal representation of movie scenes based on multimedia content analysis and user's physiological emotional responses. 5th Joint Workshop on Machine Learning and Multimodal Interaction, 2008. [ bib ]
[409] D. Gillick, D. Hakkani-Tur, and M. Levit. Unsupervised learning of edit parameters for matching name variants. to appear in Proceedings of Interspeech 2008, Brisbane, Australia, 2008. [ bib ]
[410] A. Pronobis, O. Martinez Monos, and B. Caputo. Svm-based discriminative accumulation scheme for place recognition. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA08), 2008. [ bib | .pdf | Abstract ]
[411] A. Torii, M. Havlena, T. Pajdla, and B. Leibe. Measuring camera translation by the dominant apical angle. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR'08), 2008. [ bib ]
[412] A. Ess, B. Leibe, K. Schindler, and L. van Gool. A mobile vision system for robust multi-person tracking. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR'08), 2008. [ bib ]
[413] N. Cornelis, B. Leibe, K. Cornelis, and L. van Gool. 3d urban scene modeling integrating recognition and reconstruction. International Journal of Computer Vision, 78(2-3):121–141, 2008. [ bib ]
[414] T. Weise, B. Leibe, and L. van Gool. Accurate and robust registration for in-hand modeling. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR'08), 2008. [ bib ]
[415] K. Schindler, L. van Gool, and B. de Gelder. Recognizing emotions expressed by body pose: a biologically inspired neural model. Neural Networks, 2008. [ bib ]
[416] J. Kludas, E. Bruno, and S. Marchand-Maillet. Exploiting synergistic and redundant features for multimedia document classification. In 32nd Annual Conference of the German Classification Society - Advances in Data Analysis, Data Handling and Business Intelligence (GfKl 2008), Hamburg, Germany, 2008. [ bib ]
[417] R. A. Negoescu and D. Gatica-Perez. Analyzing flickr groups. In Proceedings of the 2008 international conference on Content-based image and video retrieval (CIVR '08), number Idiap-RR-03-2008, 2008. To appear in Proceedings of CIVR'08. [ bib | Abstract ]
[418] G. Aradilla, H. Bourlard, and M. Magimai-Doss. Using kl-based acoustic models in a large vocabulary recognition task. Idiap-RR Idiap-RR-14-2008, IDIAP, 2008. [ bib | Abstract ]
[419] S. Voloshynovskiy, O. Koval, R. Villán, F. Beekhof, and T. Pun. Authentication of biometric identification documents via mobile devices. Journal of Electronic Imaging, 2008. [ bib ]
[420] G. Garipelli, R. Chavarriaga, and J. del R. Millán. Recognition of anticipatory behavior from human eeg. In 4th Intl. Brain-Computer Interface Workshop and Training Course. Graz University, Austria, 2008. IDIAP-RR 08-52. [ bib | Abstract ]
[421] I. Bogdanova, A. Bur, and H. Hügli. The spherical approach to omnidirectional visual attention. In XVI European Signal Processing Conference (EUSIPCO 2008), Proc. EUSIPCO, 2008. [ bib ]
[422] P. Prodanov, A. Drygajlo, J. Richiardi, and A. Alexander. Low-level grounding in a multimodal mobile service robot conversational system using graphical models. Intelligent Service Robotics, 1:3–26, 2008. [ bib | DOI ]
[423] J. Mariéthoz, S. Bengio, and Y. Grandvalet. Kernel based text-independent speaker verification. Idiap-RR Idiap-RR-68-2008, Idiap, 2008. [ bib | .pdf ]
[424] D. Weinshall, H. Hermansky, A. Zweig, J. Luo, H. Jimison, F. Ohl, and M. Pavel. Beyond novelty detection: Incongruent events, when general and specific classifiers disagree. In Advances in Neural Information Processing Systems 21, 2008. [ bib | .pdf | Abstract ]
[425] S. Ganapathy, P. Motlicek, and H. Hermansky. Modified discrete cosine transform for encoding residual signals in frequency domain linear prediction. Idiap-RR Idiap-RR-74-2008, Idiap, 2008. [ bib | .pdf ]
[426] J. P. Pinto, I. Szoke, S. R. Mahadeva Prasanna, and H. Hermansky. Fast approximate spoken term detection from sequence of phonemes. In The 31st Annual International ACM SIGIR Conference, 20-24 July 2008, pages 28–33, 2008. IDIAP-RR 08-45. [ bib | Abstract ]
[427] R. Sala Llonch, E. Kokiopoulou, I. Tosic, and P. Frossard. 3d face recognition using sparse spherical representations. IEEE Int. Conf. Pattern Recognition (ICPR), 2008. [ bib ]
[428] S. Stoyanchev, G. Tur, and D. Hakkani-Tur. Name-aware speech recognition for interactive question answering. IEEE ICASSP, Las Vegas, NV, 2008. [ bib ]
[429] B. Deville, G. Bologna, M. Vinckenbosch, and T. Pun. Guiding the focus of attention of blind people with visual saliency. In Workshop on Computer Vision Applications for the Visually Impaired (CVAVI 08), 2008. [ bib ]
[430] T. Tommasi, F. Orabona, and B. Caputo. Clef2008 image annotation task: an svm confidence-based approach. Idiap-RR Idiap-RR-77-2008, Idiap, 2008. CLEF 2008 Working Notes. [ bib | .pdf | Abstract ]
[431] A. Popescu-Belis, E. Boertjes, J. Kilgour, P. Poller, S. Castronovo, T. Wilson, A. Jaimes, and J. Carletta. The amida automatic content linking device: Just-in-time document retrieval in meetings. In A. Popescu-Belis and R. Stiefelhagen, editors, Machine Learning for Multimodal Interaction V, volume 5237 of LNCS, pages 272–283. Springer-Verlag, 2008. [ bib | DOI | .pdf | Abstract ]
[432] E. Kokiopoulou, P. Frossard, and D. Gkorou. Optimal polynomial filtering for accelerating distributed consensus. IEEE Int. Symp. on Information Theory (ISIT), 2008. [ bib ]
[433] H. Hung, Y. Huang, G. Friedland, and D. Gatica-Perez. Estimating the dominant person in multi-party conversations using speaker diarization strategies. In ICASSP 08, 2008. [ bib ]
[434] A. Popescu-Belis, H. Bourlard, and S. Renals. Machine learning for multimodal interaction iv (revised selected papers from mlmi 2007, brno, 28-30 june 2007). LNCS 4892. Springer-Verlag, Berlin/Heidelberg, 2008. [ bib ]
[435] A. Popescu-Belis and R. Stiefelhagen. Machine learning for multimodal interaction v (proceedings of mlmi 2008, utrecht, 8-10 september 2008). LNCS 5237. Springer-Verlag, Berlin/Heidelberg, 2008. [ bib ]
[436] D. Gatica-Perez and K. Farrahi. What did you do today? discovering daily routines from large-scale mobile data. In ACM International Conference on Multimedia (ACMMM), 2008. IDIAP-RR 08-49. [ bib | Abstract ]
[437] F. Valente and H. Hermansky. Hierarchical and parallel processing of modulation spectrum for asr applications. In IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), pages 4165–4168, 2008. IDIAP-RR 07-45. [ bib | DOI | Abstract ]
[438] B. Dumas, D. Lalanne, D. Guinard, R. Koenig, and R. Ingold. Strengths and weaknesses of software architectures for the rapid creation of tangible and multimodal interfaces. pages 47–54, 2008. [ bib ]
[439] H. Bourlard and S. Renals. Recognition and understanding of meetings overview of the european ami and amida projects. In LangTech 2008, 2008. IDIAP-RR 08-27. [ bib | Abstract ]
[440] B. Dumas, D. Lalanne, and R. Ingold. Prototyping multimodal interfaces with smuiml modeling language. pages 63–66, 2008. [ bib ]
[441] J. Kludas, E. Bruno, and S. Marchand-Maillet. Can feature information interaction help for information fusion in multimedia problems? To appear in Multimedia Tools and Applications Journal special issue on "Metadata Mining for Image Understanding", 2008. [ bib ]
[442] N. Tsourakis, A. Lisowska, P. Bouillon, and M. Rayner. From desktop to mobile: adapting a successful voice interaction platform for use in mobile devices. In Third ACM MobileHCI Workshop on Speech in Mobile and Pervasive Environments (SiMPE), 2008. 2nd-5th, September. [ bib ]
[443] A. Popescu-Belis, P. Baudrion, M. Flynn, and P. Wellner. Towards an objective test for meeting browsers: the bet4tqb pilot experiment. In A. Popescu-Belis, H. Bourlard, and S. Renals, editors, Machine Learning for Multimodal Interaction IV, LNCS 4892, pages 108–119. Springer-Verlag, Berlin/Heidelberg, 2008. [ bib | DOI ]
[444] A. Schlapbach, F. Wettstein, and H. Bunke. Automatic estimation of the readability of handwritten text. In Proc. 16th European Signal Processing Conference, 2008. [ bib ]
[445] A. Popescu-Belis, E. Boertjes, J. Kilgour, P. Poller, S. Castronovo, T. Wilson, A. Jaimes, and J. Carletta. The amida automatic content linking device: just-in-time document retrieval in meetings. In A. Popescu-Belis and R. Stiefelhagen, editors, Machine Learning for Multimodal Interaction V (Proceedings of MLMI 2008, Utrecht, 8-10 September 2008), LNCS 5237, pages 273–284. Springer-Verlag, Berlin/Heidelberg, 2008. [ bib ]
[446] L. Dollé, M. Khamassi, B. Girard, A. Guillot, and R. Chavarriaga. Analyzing interactions between navigation strategies using a computational model of action selection. In Spatial Cognition 2008 (SC '08), Lecture Notes in Computer Science, pages 71–86, 2008. IDIAP-RR 08-48. [ bib | Abstract ]
[447] G. Gonzalez, F. Fleuret, and P. Fua. Automated delineation of dendritic networks in noisy image stacks. In The 10th European Conference on Computer Vision, 2008. [ bib ]
[448] K. Kumatani, J. McDonough, S. Schacht, D. Klakow, P. N. Garner, and W. Li. Filter bank design for subband adaptive beamforming and application to speech recognition. Idiap-RR Idiap-RR-02-2008, IDIAP, 2008. [ bib | .ps.gz | .pdf | Abstract ]
[449] J. Anemuller, J. H. Back, B. Caputo, M. Havlena, J. Luo, H. Kayser, B. Leibe, P. Motlicek, T. Pajdla, M. Pavel, A. Torii, L. van Gool, A. Zweig, and H. Hermansky. The dirac awear audio-visual platform for detection of unexpected and incongruent events. In Proceedings of the International Conference on Multimodal Interfaces, 2008. [ bib | .pdf | Abstract ]
[450] K. Kumatani, J. McDonough, D. Klakow, P. N. Garner, and W. Li. Maximum negentropy beamforming. Idiap-RR Idiap-RR-07-2008, IDIAP, 2008. [ bib | .ps.gz | .pdf ]
[451] G. Friedland and O. Vinyals. Live speaker identification in conversations. In ACM Multimedia 2008, Vancouver, Canada, pages 1017–1018, 2008. [ bib ]
[452] H. Hung and G. Friedland. Towards audio-visual on-line diarization of participants in group meetings. In European Conference on Computer Vision (ECCV) 2008, Marseille, France, 2008. [ bib ]
[453] O. Vinyals and G. Friedland. A hardware-independent fast logarithm approximation with adjustable accuracy. In 10th IEEE International Symposium on Multimedia, Berkeley, CA, USA, pages 61–65, 2008. [ bib ]
[454] O. Vinyals and G. Friedland. Modulation spectrogram features for speaker diarization. In Interspeech 2008, Brisbane, Australia, pages 630–633, 2008. [ bib ]
[455] K. Boakye, O. Vinyals, and G. Friedland. Two's a crowd: improving speaker diarization by automatically identifying and excluding overlapped speech. In Interspeech 2008, Brisbane, Australia, pages 32–35, 2008. [ bib ]
[456] F. Fleuret and D. Geman. Stationary features and cat detection. Journal of Machine Learning Research (JMLR), 9:2549–2578, 2008. [ bib ]
[457] A. Thomas, S. Ganapathy, and H. Hermansky. Hilbert envelope based features for far-field speech recognition. In MLMI 2008. Utrecht, The Netherlands, 2008. IDIAP-RR 08-42. [ bib | Abstract ]
[458] M. Soleymani, J. Kierkels, G. Chanel, E. Bruno, S. Marchand-Maillet, and T. Pun. Estimating emotions and tracking interest during movie watching based on multimedia content and physiological responses. In Joint (IM)2-Interactive Multimodal Information Management and Affective Sciences NCCRs meeting, Riederalp, Switzerland, 2008. [ bib ]
[459] P. W. Ferrez and J. del R. Millán. Simultaneous real-time detection of motor imagery and error-related potentials for improved bci accuracy. In Proceedings of the 4th International Brain-Computer Interface Workshop and Training Course, 2008. [ bib | Abstract ]
[460] B. Leibe, A. Ettlin, and B. Schiele. Learning semantic object parts for object categorization. Image and Vision Computing, 26(1):15–26, 2008. [ bib ]
[461] D. Vijayasenan, F. Valente, and H. Bourlard. Integration of tdoa features in information bottleneck framework for fast speaker diarization. In Interspeech 2008, 2008. IDIAP-RR 08-26. [ bib | .ps.gz | .pdf | Abstract ]
[462] A. Popescu-Belis, M. Flynn, P. Wellner, and P. Baudrion. Task-based evaluation of meeting browsers: from bet task elicitation to user behavior analysis. In LREC 2008 (6th International Conference on Language Resources and Evaluation), Marrakech, Morocco, 2008. [ bib | Abstract ]
[463] P. Estrella, A. Popescu-Belis, and M. King. Improving contextual quality models for mt evaluation based on evaluators' feedback. In LREC 2008 (6th International Conference on Language Resources and Evaluation), Marrakech, Morocco, 2008. [ bib | Abstract ]
[464] F. De Simone, M. Ansorge, and T. Ebrahimi. A multi-channel objective model for the full-reference assessment of color pictures. In 2nd K-space Jamboree Workshop, 2008. [ bib | http | Abstract ]
[465] R. Bertolami, C. Gutmann, L. Spitz, and H. Bunke. Shape code based lexicon reduction for offline handwriting recognition. In Proc. 8th IAPR Int. Workshop on Document Analysis Systems, pages 158–163, 2008. [ bib ]
[466] D. Vijayasenan, F. Valente, and H. Bourlard. Combination of agglomerative and sequential clustering for speaker diarization. In International Conference on Acoustics, Speech and Signal Processing, 2008. [ bib ]
[467] F. Orabona, J. Keshet, and B. Caputo. The projectron: a bounded kernel-based perceptron. In Int. Conf. on Machine Learning, 2008. IDIAP-RR 08-30. [ bib | .ps.gz | .pdf | Abstract ]
[468] A. Popescu-Belis, H. Bourlard, and S. Renals. Machine learning for multimodal interaction iv, volume 4892 of LNCS. Springer-Verlag, Berlin/Heidelberg, 2008. http://www.springeronline.com/978-3-540-78154-7. [ bib | Abstract ]
[469] T. Dutoit, L. Couvreur, and H. Bourlard. How does a dictation machine recognize speech? In Applied Signal Processing–A MATLAB approach, chapter 4, pages 104–148. Springer MA, 2008. [ bib | .pdf ]
[470] D. Gatica-Perez and K. Farrahi. Discovering human routines from cell phone data with topic models. In IEEE International Symposium on Wearable Computers (ISWC), 2008. IDIAP-RR 08-32. [ bib | Abstract ]
[471] K. Schindler and D. Suter. Object detection by global contour shape. Pattern Recognition, 2008. [ bib ]
[472] G. Garipelli, R. Chavarriaga, and J. del R. Millán. Fast recognition of anticipation related potentials. IEEE Transactions on Biomedical Engineering, 2008. In press. [ bib | Abstract ]
[473] H. Bourlard, R. Chavarriaga, F. Galán, and J. del R. Millán. Characterizing the eeg correlates of exploratory behavior. IEEE Transactions on Neural Systems & Rehabilitation Engineering, 2008. IDIAP-RR 08-28. [ bib | Abstract ]
[474] W. Li. Effective post-processing for single-channel frequency-domain speech enhancement. Number Idiap-RR-71-2007, pages 149–152, 2008. Submitted for publication. [ bib | DOI | Abstract ]
[475] S. Ba and J. M. Odobez. Multi-person visual focus of attention from head pose and meeting contextual cues. Technical Report 47, IDIAP Research Report 47, submitted to the IEEE Transactions on Pattern Analysis and Machine Intelligence, second revision, 2008. [ bib ]
[476] K. Kryszczuk and A. Drygajlo. What do quality measures predict in biometrics. pages –,–29, Lausanne, 2008. [ bib ]
[477] A. Shahrokni, T. Drummond, F. Fleuret, and P. Fua. Classification-based probabilistic modeling of texture transition for fast line search tracking and delineation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2008. [ bib ]
[478] R. Bertolami and H. Bunke. Ensemble methods to improve the performance of an english handwritten text line recognizer. In D. Doerman and S. Jaeger, editors, Arabic and Chinese Handwriting Recognition, LNCS 4768, pages 265–277. Springer, 2008. [ bib ]
[479] J. Keshet and S. Bengio. Automatic speech and speaker recognition: large margin and kernel methods. John Wiley & Sons, 2008. [ bib | Abstract ]
[480] K. Boakye, B. Trueba-Hornero, O. Vinyals, and G. Friedland. Overlapped speech detection for improved speaker diarization in multiparty meetings. In International Conference on Acoustics, Speech, and Signal Processing, 2008. [ bib ]
[481] D. Jayagopi, S. Ba, J. M. Odobez, and D. Gatica-Perez. Predicting two facets of social verticality in meetings from five-minute time slices and nonverbal cues. In Proc. Int. Conf. on Multimodal Interfaces (ICMI), Special Session on Social Signal Processing, Chania, 2008. [ bib ]
[482] J. del R. Millán, P. W. Ferrez, F. Galán, E. Lew, and R. Chavarriaga. Non-invasive brain-machine interaction. International Journal of Pattern Recognition and Artificial Intelligence, 2008. [ bib | Abstract ]
[483] A. Thomas, V. Ferrari, B. Leibe, T. Tuytelaars, and L. van Gool. Using recognition to guide a robot's attention. In Robotics Science and Systems, 2008. in press. [ bib ]
[484] K. Schindler and L. van Gool. Action snippets: how many frames does human action recognition require? In IEEE Conference on Computer Vision and Pattern Recognition (CVPR'08). IEEE Press, 2008. [ bib ]
[485] E. Indermühle, M. Liwicki, and H. Bunke. Recognition of handwritten historical documents: hmm-adaptation vs. writer specific training. In Proc. 11th Int. Conf. on Frontiers in Handwriting Recognition, pages 186–191, 2008. [ bib ]
[486] D. Jayagopi, B. Raducanu, and D. Gatica-Perez. Characterizing conversational group dynamics using nonverbal behavior. In Proc. IEEE Int. Conf. on Multimedia (ICME), New York, 2008. [ bib ]
[487] B. Caputo. Class specific object recognition using kernel gibbs distributions. Electronic Letters on Computer Vision and Image Analysis, 7(2):96–109, 2008. Special Issue on Computational Modelling of Objects Represented in Images. [ bib | .pdf ]
[488] D. Morrison, S. Marchand-Maillet, and E. Bruno. Semantic clustering of images using patterns of relevance feedback. In Proceedings of the 6th International Workshop on Content-based Multimedia Indexing (CBMI'2008), London, UK, 2008. [ bib ]
[489] D. Grandjean and T. Pun, editors. Multimodality in emotions and for their assessment, 2008. Workshop at Joint (IM)2-Interactive Multimodal Information Management and Affective Sciences NCCRs meeting. [ bib ]
[490] D. Lalanne, M. Rigamonti, R. Ingold, F. Evéquoz, and B. Dumas. An ego-centric and tangible approach to meeting indexing and browsing, volume 4892 of Lecture Notes in Computer Science. Springer Berlin / Heidelberg, computer science edition, 2008. [ bib | DOI | Abstract ]
[491] J. Luo, B. Caputo, A. Zweig, J. H. Back, and J. Anemuller. Object category detection using audio-visual cues. In International Conference on Computer Vision Systems (ICVS08), Santorini, Greece, 2008. [ bib | Abstract ]
[492] K. Kumatani, J. McDonough, B. Rauch, D. Klakow, P. N. Garner, and W. Li. Beamforming with a maximum negentropy criterion. IEEE Transactions on Audio Speech and Language Processing, 17(5):994–1008, 2008. [ bib | .pdf | Abstract ]
[493] G. Zeng and L. van Gool. Multi-label image segmentation via point-wise repetition. In International Conference on Computer Vision and Pattern Recognition (CVPR), 2008. [ bib ]
[494] S. Gammeter, A. Ess, T. Jaeggli, B. Leibe, K. Schindler, and L. van Gool. Articulated multibody tracking under egomotion. In European Conference on Computer Vision (ECCV'08), LNCS. Springer, 2008. in press. [ bib ]
[495] T. Quack, B. Leibe, and L. van Gool. World-scale mining of objects and events from community photo collections. In Conference on Image and Video Retrieval (CIVR'08). ACM, 2008. [ bib ]
[496] B. Leibe, K. Schindler, N. Cornelis, and L. van Gool. Coupled object detection and tracking from static cameras and moving vehicles. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2008. [ bib ]
[497] S. Pellegrini, K. Schindler, and D. Nardi. A generalization of the icp algorithm for articulated bodies. In M. Everingham and C. Needham, editors, British Machine Vision Conference (BMVC'08), 2008. [ bib ]
[498] M. Szafranski, Y. Grandvalet, and A. Rakotomamonjy. Composite kernel learning. In A. McCallum and S. Roweis, editors, Proceedings of the 25th Annual International Conference on Machine Learning (ICML 2008), pages 1040–1047. Omnipress, 2008. IDIAP-RR 08-59. [ bib | .pdf | Abstract ]
[499] S. Voloshynovskiy, O. Koval, F. Beekhof, and T. Pun. Multimodal authentication based on random projections and distributed coding. In MM&Sec 2008, 2008. [ bib ]
[500] S. Ganapathy, P. Motlicek, H. Hermansky, and H. Garudadri. Autoregressive modelling of hilbert envelopes for wide-band audio coding. In AES 124th Convention, Audio Engineering Society, 2008. IDIAP-RR 08-40. [ bib | Abstract ]
[501] J. Kludas, E. Bruno, and S. Marchand-Maillet. Can feature information interaction help for information fusion in multimedia problems? In First International Workshop on Metadata Mining for Image Understanding, pages 23–33, Funchal, Madeira, 2008. [ bib ]
[502] N. Garg and D. Hakkani-Tur. Speaker role detection in meetings using lexical information and social network analysis. Technical Report TR-08-004, International Computer Science Institute, Berkeley, CA, 2008. [ bib ]
[503] A. Humm, J. Hennebert, and R. Ingold. Combined handwriting and speech modalities for user authentication. IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems and Humans, 38, 2008. [ bib ]
[504] A. Humm, J. Hennebert, and R. Ingold. Spoken signature for user authentication. SPIE Journal of Electronic Imaging, 17, 2008. [ bib ]
[505] K. Kumatani, J. McDonough, D. Klakow, P. N. Garner, and W. Li. Adaptive beamforming with a maximum negentropy criterion. In The Joint Workshop on Hands-free Speech Communication and Microphone Arrays, 2008. [ bib ]
[506] K. Kumatani, J. McDonough, S. Schacht, D. Klakow, P. N. Garner, and W. Li. Filter bank design based on minimization of individual aliasing terms for minimum mutual information subband adaptive beamforming. In International Conference on Acoustics Speech and Signal Processing, 2008. [ bib ]
[507] M. van den Berg, E. Koller-Meier, and L. van Gool. Fast body posture estimation using volumetric features. In IEEE Visual Motion Computing (MOTION), 2008. [ bib ]
[508] T. Spindler, C. Wartmann, L. Hovestadt, D. Roth, L. van Gool, and A. Steffen. Privacy in video surveilled spaces. Journal of Computer Security, 16(2):199–222, 2008. [ bib ]
[509] T. Quack, H. Bay, and L. van Gool. Object recognition for the internet of things. In Internet of Things 2008, 2008. in press. [ bib ]
[510] P. N. Garner. A weighted finite state transducer tutorial. Idiap-Com Idiap-Com-03-2008, IDIAP, 2008. [ bib | Abstract ]
[511] J. del R. Millán. Brain-controlled robots. IEEE Intelligent Systems, 2008. [ bib | .pdf | Abstract ]
[512] J. Richiardi, A. Drygajlo, and L. Todesco. Promoting diversity in gaussian mixture ensembles: an application to signature verification. In Biometrics and Identity Management, Lecture Notes in Computer Science 5372, pages 140–149, Heidelberg, 2008. [ bib ]
[513] D. Gatica-Perez and K. Farrahi. Daily routine classification from mobile phone data. In Workshop on Machine Learning and Multimodal Interaction (MLMI08), 2008. IDIAP-RR 07-62. [ bib | Abstract ]
[514] K. Kamangar, D. Hakkani-Tur, G. Tur, and M. Levit. An iterative unsupervised learning method for information distillation. accepted for IEEE ICASSP, Las Vegas, NV, 2008. [ bib ]
[515] F. Camastra and A. Vinciarelli. Machine learning for audio, image and video analysis, volume XVI of Advanced Information and Knowledge Processing. Springer Verlag, theory and applications edition, 2008. [ bib | Abstract ]
[516] F. Fleuret and D. Geman. Stationary features and cat detection. Journal of Machine Learning Research, 2008. [ bib | Abstract ]
[517] H. Ketabdar and H. Bourlard. Enhanced phone posteriors for improving speech recognition systems. Idiap-RR Idiap-RR-39-2008, IDIAP, 2008. [ bib | Abstract ]
[518] J. Yao and J. M. Odobez. Multi-camera 3d person tracking with particle filter in a surveillance environment. In 16th European Signal Processing Conference (EUSIPCO), 2008. [ bib | .pdf | Abstract ]
[519] A. Carreras, G. Cordara, J. Delgado, F. Dufaux, G. Francini, T. M. Ha, E. Rodriguez, and R. Tous. A search and retrieval framework for the management of copyrighted audiovisual content. In 50th International Symposium ELMAR 2008, 2008. [ bib | http | Abstract ]
[520] L. Goldmann, T. Adamek, P. Vajda, M. Karaman, R. Mörzinger, E. Galmar, T. Sikora, N. O'Connor, T. Ha-Minh, T. Ebrahimi, P. Schallauer, and B. Huet. Towards fully automatic image segmentation evaluation. In Advanced Concepts for Intelligent Vision Systems (ACIVS), Lecture Notes in Computer Science, Berlin, 2008. Springer. [ bib | http ]
[521] F. De Simone, D. Ticca, F. Dufaux, M. Ansorge, and T. Ebrahimi. A comparative study of color image compression standards using perceptually driven quality metrics. In SPIE Optics and Photonics, 2008. [ bib | Abstract ]
[522] U. Hoffmann, J. Naruniec, A. Yazdani, and T. Ebrahimi. Face detection using discrete gabor jets and color information. In SIGMAP 2008 - International Conference on Signal Processing and Multimedia Applications, 2008. [ bib | http | Abstract ]
[523] J. Kludas, E. Bruno, and S. Marchand-Maillet. Exploiting synergistic and redundant features for multimedia document classification. In 32nd Annual Conference of the German Classification Society - Advances in Data Analysis, Data Handling and Business Intelligence (GfKl 2008), Hamburg, Germany, 2008. [ bib | .pdf ]
[524] B. Dumas, D. Lalanne, and R. Ingold. Démonstration : hephaistk, une boîte à outils pour le prototypage d'interfaces multimodales. 2008. [ bib ]
[525] E. Kokiopoulou and P. Frossard. Semantic coding by supervised dimensionality reduction. IEEE Transactions on Multimedia, 10(2), 2008. [ bib ]
[526] M. Knox, N. Morgan, and N. Mirghafori. Getting the last laugh: automatic laughter segmentation in meetings. In 9th International Conference of the ISCA (Interspeech 2008), Brisbane, Australia, pages 797–800, 2008. [ bib ]
[527] S. Ganapathy, A. Thomas, and H. Hermansky. Front-end for far-field speech recognition based on frequency domain linear prediction. In Interspeech 2008, 2008. IDIAP-RR 08-17. [ bib | Abstract ]
[528] X. Perrin, R. Chavarriaga, C. Ray, R. Siegwart, and J. del R. Millán. A comparative psychophysical and eeg study of different feedback modalities for hri. In Human-Robot Interaction (HRI08), Amsterdam, 2008. [ bib | Abstract ]
[529] A. Popescu-Belis. Dimensionality of dialogue act tagsets: an empirical analysis of large corpora. Language Resources and Evaluation, 42(1):99–107, 2008. [ bib | DOI | Abstract ]
[530] M. Knox, N. Morgan, and N. Mirghafori. Getting the last laugh: automatic laughter segmentation in meetings. to appear in Proceedings of Interspeech 2008, Brisbane, Australia, 2008. [ bib ]
[531] B. Noris, K. Benmachiche, and A. Billard. Calibration-free eye gaze direction detection with gaussian processes. In International Conference on Computer Vision Theory and Applications (VISAPP 2008), 2008. [ bib | Abstract ]
[532] J. F. Paiement. Probabilistic models for music. PhD thesis, École Polytechnique Fédérale de Lausanne, 2008. Thèse Ecole polytechnique fédérale de Lausanne EPFL, no 4148 (2008), Faculté des sciences et techniques de l'ingénieur STI, Institut de génie électrique et électronique IEL (Laboratoire de l'IDIAP LIDIAP). Dir.: Hervé Bourlard, Samy Bengio. [ bib | .pdf | Abstract ]
[533] A. Schlapbach and H. Bunke. Off-line writer identification and verification using gaussian mixture models. In S. Marinai, editor, Machine Learning in Document Analysis and Recognition, pages 409–428. Springer, 2008. [ bib ]
[534] H. Hung, D. Jayagopi, S. Ba, J. M. Odobez, and D. Gatica-Perez. Investigating automatic dominance estimation in groups from visual attention and speaking activity. In Proc. ICMI, Chania, Greece, 2008. [ bib ]
[535] T. Tommasi, F. Orabona, and B. Caputo. Cue integration for medical image annotation. In Advances in Multilingual and Multimodal Information Retrieval: 8th Workshop of the Cross-Language Evaluation Forum, CLEF 2007, Budapest, Hungary, September 19-21, 2007, Revised Selected Papers, LNCS. Springer-Verlag, 2008. [ bib | .pdf | Abstract ]
[536] S. Voloshynovskiy, O. Koval, and T. Pun. Multimodal authentication based on random projections and distributed coding. In Proceedings of the 10th ACM Workshop on Multimedia & Security, Oxford, UK, 2008. [ bib ]
[537] H. Ketabdar and H. Bourlard. Hierarchical integration of phonetic and lexical knowledge in phone posterior estimation. In International Conference on Acoustics, Speech, and Signal Processing, 2008. [ bib | Abstract ]
[538] B. Schouten, N. Juul, A. Drygajlo, and M. Tistarelli. Biometrics and identity management. Springer, 2008. [ bib ]
[539] S. Favre, H. Salamin, A. Vinciarelli, D. Hakkani-Tur, and N. Garg. Role recognition for meeting participants: an approach based on lexical information and social network analysis. In ACM International Conference on Multimedia, 2008. [ bib | Abstract ]
[540] G. Chanel, C. Rebetez, M. Betrancourt, and T. Pun. Boredom, engagement and anxiety as indicators for adaptation to difficulty in games. In ACM Mindtrek conference, Tampere, Finland, 2008. [ bib ]
[541] H. Hung, D. Jayagopi, S. Ba, J. M. Odobez, and D. Gatica-Perez. Investigating automatic dominance estimation in groups from visual attention and speaking activity. In International Conference on Multimodal Interfaces (ICMI), 2008. [ bib ]
[542] R. Tous, A. Carreras, J. Delgado, G. Cordara, F. Gianluca, E. Peig, F. Dufaux, and G. Galinski. An architecture for tv content distributed search and retrieval using the mpeg query format (mpqf). In International Workshop on Ambient Media Delivery and Interactive Television (AMDIT 2008), 2008. [ bib | http | Abstract ]
[543] B. Deville, G. Bologna, M. Vinckenbosch, and T. Pun. Guiding the focus of attention of blind people with visual saliency. In Workshop on Computer Vision Applications for the Visually Impaired (CVAVI 08), Satellite Workshop of the European Conference on Computer Vision (ECCV 2008), Marseille, France, October 18, 2008. [ bib ]
[544] D. Jayagopi. Predicting two facets of social verticality in meetings from five-minute time slices and nonverbal cues. In Proc. ICMI, Chania, Greece, 2008. [ bib ]
[545] E. Shriberg. Higher level features in speaker recognition. in C. Muller (Ed.) Speaker Classification I. Springer-Verlag, New York, 2008. [ bib ]
[546] B. Dumas, D. Lalanne, and R. Ingold. Prototyping multimodal interfaces with smuiml modeling language. In Proceedings of CHI 2008 Workshop on UIDLs for Next Generation User Interfaces (CHI 2008 workshop), pages 63–66, Florence (Italy), 2008. [ bib ]
[547] B. Dumas, D. Lalanne, D. Guinard, R. Koenig, and R. Ingold. Strengths and weaknesses of software architectures for the rapid creation of tangible and multimodal interfaces. In Proceedings of 2nd international conference on Tangible and Embedded Interaction (TEI 2008), pages 47–54, Bonn (Germany), 2008. [ bib ]
[548] K. Smith, S. Ba, D. Gatica-Perez, and J. M. Odobez. Tracking the visual focus of attention for a varying number of wandering people. IEEE Trans. on Pattern Analysis and Machine Intelligence, 30(7):1212–1229, 2008. [ bib ]
[549] C. Carincotte, X. Naturel, M. Hick, J. M. Odobez, J. Yao, A. Bastide, and B. Corbucci. Understanding metro station usage using closed circuit television cameras analysis. In 11th International IEEE Conference on Intelligent Transportation Systems (ITSC), 2008. [ bib | .pdf | Abstract ]
[550] S. Zhao and N. Morgan. Multi-stream spectro-temporal features for robust speech recognition. to appear in Proceedings of Interspeech 2008, Brisbane, Australia, 2008. [ bib ]
[551] M. Soleymani, G. Chanel, J. Kierkels, and T. Pun. Valence-arousal representation of movie scenes based on multimedia content analysis and user's physiological emotional responses. In MLMI 2008, 5th Joint Workshop on Machine Learning and Multimodal Interaction, Utrecht, The Netherlands, 2008. (PhD student poster session, with extended abstract). [ bib ]
[552] E. Bruno, N. Moüenne-Loccoz, and S. Marchand-Maillet. Design of multimodal dissimilarity spaces for retrieval of multimedia documents. To appear in IEEE Transactions on Pattern Analysis and Machine Intelligence, 2008. [ bib ]
[553] H. Ketabdar. Enhancing posterior based speech recognition systems. PhD thesis, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland, 2008. Thèse Ecole polytechnique fédérale de Lausanne EPFL, no 4218 (2008), Faculté des sciences et techniques de l'ingénieur STI, Section de génie électrique et électronique, Institut de génie électrique et électronique IEL (Laboratoire de l'IDIAP LIDIAP). Dir.: Hervé Bourlard. [ bib | .pdf | Abstract ]
[554] O. Vinyals and G. Friedland. Modulation spectrogram features for speaker diarization. to appear in proceedings of Interspeech 2008, Brisbane, Australia, 2008. [ bib ]
[555] O. Vinyals and G. Friedland. Towards semantic analysis of conversations: a system for the live identification of speakers in meetings. to appear in Proceedings of IEEE International Conference on Semantic Computing, Santa Clara, CA, 2008. [ bib ]
[556] O. Vinyals and G. Friedland. Live speaker identification in meetings: "who is speaking now?". Technical Report TR-08-001, International Computer Science Institute, Berkeley, CA, 2008. [ bib ]
[557] J. Richiardi, A. Drygajlo, and L. Todesco. Promoting diversity in gaussian mixture ensembles: an application to signature verification. pages 140–149, Heidelberg, 2008. Springer. [ bib ]
[558] S. Ba and J. M. Odobez. Visual focus of attention estimation from head pose posterior probability distributions. In IEEE Proc. Int. Conf. on Multimedia and Expo (ICME), Hannover, 2008. [ bib | Abstract ]
[559] E. Grossmann, J. A. Gaspar, and F. Orabona. Calibration from statistical properties of the visual world. In European Conf. on Computer Vision, 2008. IDIAP-RR 08-63. [ bib | .ps.gz | .pdf | Abstract ]
[560] A. Popescu-Belis and R. Stiefelhagen. Machine learning for multimodal interaction v, volume 5237 of LNCS. Springer-Verlag, Berlin/Heidelberg, 2008. [ bib ]
[561] M. M. Ullah, A. Pronobis, B. Caputo, J. Luo, P. Jensfelt, and H. I. Christensen. Towards robust place recognition for robot localization. In IEEE International Conference on Robotics and Automation, 2008. [ bib | .pdf | Abstract ]
[562] J. P. Pinto, H. Hermansky, B. Yegnanarayana, and M. Magimai-Doss. Exploiting contextual information for improved phoneme recognition. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2008), pages 4449–4452, 2008. IDIAP-RR 07-65. [ bib | DOI | Abstract ]
[563] P. Motlicek, S. Ganapathy, and H. Hermansky. Entropy coding of quantized spectral components in fdlp audio codec. Idiap-RR Idiap-RR-71-2008, Idiap, 2008. [ bib | .pdf | Abstract ]
[564] A. Popescu-Belis. Reference-based vs. task-based evaluation of human language technology. In LREC 2008 ELRA Workshop on Evaluation: "Looking into the Future of Evaluation: When automatic metrics meet task-based and performance-based approaches", pages 12–16, Marrakech, Morocco, 2008. ELRA. [ bib | Abstract ]
[565] A. Thomas, S. Ganapathy, and H. Hermansky. Recognition of reverberant speech using frequency domain linear prediction. IEEE Signal Processing Letters, 2008. IDIAP-RR 08-41. [ bib | Abstract ]
[566] P. W. Ferrez and J. del R. Millán. Error-related eeg potentials generated during simulated brain-computer interaction. IEEE Trans. on Biomedical Engineering, 55(3):923–929, 2008. [ bib ]
[567] J. P. Pinto and H. Hermansky. Combining evidence from a generative and a discriminative model in phoneme recognition. In Proceedings of Interspeech 2008, 2008. IDIAP-RR 08-20. [ bib | Abstract ]
[568] J. Meynet and J. Ph. Thiran. Ensembles of svms using an information theoretic criterion. Pattern Recognition Letters, 2008. ITS. [ bib | Abstract ]
[569] U. Hoffmann, J. M. Vesin, T. Ebrahimi, and K. Diserens. An efficient p300-based brain-computer interface for disabled subjects. Journal of Neuroscience Methods, 167(1):115–125, 2008. Datasets and MATLAB-Code are available at http://bci.epfl.ch. [ bib | DOI | Abstract ]
[570] J. Meynet and J. Ph. Thiran. Information theoretic combination of classifiers. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2008. ITS. [ bib | DOI | Abstract ]
[571] L. Gui, J. Ph. Thiran, and N. Paragios. Cooperative object segmentation and behavior inference in image sequences. International Journal of Computer Vision, 2008. [ bib | DOI ]
[572] R. Bertolami and H. Bunke. Including language model information in the combination of handwritten text line recognizers. In Proc. 11th Int. Conf. on Frontiers in Handwriting Recognition, pages 25–30, 2008. [ bib ]
[573] R. Bertolami and H. Bunke. Hidden markov model based ensemble methods for offline handwritten text line recognition. Pattern Recognition, 41(11):3452–3460, 2008. [ bib ]
[574] K. Kryszczuk and A. Drygajlo. Credence estimation and error prediction in biometric identity verification. Signal Processing, 88(4):916–925, 2008. [ bib | http ]
[575] E. Kokiopoulou, P. Frossard, and O. Verscheure. Fast keyword detection with sparse time-frequency models. IEEE Int. Conf. on Multimedia & Expo (ICME), 2008. [ bib ]
[576] K. Kryszczuk and A. Drygajlo. Impact of feature correlations on separation between bivariate normal distributions. In 19th International Conference on Pattern Recognition, Tampa, Florida, USA, 2008. [ bib ]
[577] J. P. Pinto, G. S. V. S. Sivaram, and H. Hermansky. Reverse correlation for analyzing mlp posterior features in asr. In 11th International Conference on Text, Speech and Dialogue (TSD), pages 469–476, 2008. IDIAP-RR 08-13. [ bib | DOI | Abstract ]
[578] X. Naturel and J. M. Odobez. Detecting queues at vending machines: a statistical layered approach. In Proc. Int. Conf. on Pattern Recognition (ICPR), 2008. [ bib | .pdf | Abstract ]
[579] T. Tommasi, F. Orabona, and B. Caputo. An svm confidence-based approach to medical image annotation. In C. Peters, D. Giampiccolo, and N. Ferro, editors, Evaluating Systems for Multilingual and Multimodal Information Access – 9th Workshop of the Cross-Language Evaluation Forum, LNCS, 2008. [ bib | .pdf | Abstract ]
[580] B. Mesot. Switching linear dynamical systems for noise robust speech recognition of isolated digits. PhD thesis, STI School of Engineering, EPFL, Lausanne, 2008. [ bib ]
[581] G. Gonzalez, F. Fleuret, and P. Fua. Automated delineation of dendritic networks in noisy image stacks. In Proceedings of the European Conference on Computer Vision (ECCV), pages 214–227, 2008. [ bib ]
[582] J. del R. Millán. Brain-controlled robots. In IEEE International Conference on Robotics and Automation (ICRA 2008), ATR Computational Neuroscience Laboratories, 2008. [ bib | DOI | Abstract ]
[583] D. Vergyri, A. Mandal, W. Wang, A. Stolcke, J. Zheng, M. Graciarena, D. Rybach, C. Gollan, R. Schlater, K. Kirchoff, A. Faria, and N. Morgan. Development of the sri/nightingale arabic asr system. In 9th International Conference of the ISCA (Interspeech 2008), Brisbane, Australia, pages 1437–1440, 2008. [ bib ]
[584] S. Favre, H. Salamin, and A. Vinciarelli. Role recognition in multiparty recordings using social affiliation networks and discrete distributions. In The Tenth International Conference on Multimodal Interfaces (ICMI 2008), number Idiap-RR-64-2008, 2008. [ bib | Abstract ]
[585] A. Thomas, S. Ganapathy, and H. Hermansky. Spectro-temporal features for automatic speech recognition using linear prediction in spectral domain. In 16th European Signal Processing Conference (EUSIPCO 2008), 2008. IDIAP-RR 08-05. [ bib | Abstract ]
[586] A. Faria and N. Morgan. Corrected tandem features for acoustic model training. In International Conference on Acoustics, Speech, and Signal Processing, 2008. [ bib ]
[587] L. Matena, A. Jaimes, and A. Popescu-Belis. Graphical representation of meetings on mobile devices. In MobileHCI 2008 Demonstrations (10th ACM International Conference on Human-Computer Interaction with Mobile Devices and Services), Amsterdam, 2008. [ bib | Abstract ]
[588] P. Besson, V. Popovici, J. M. Vesin, J. Ph. Thiran, and M. Kunt. Extraction of audio features specific to speech production for multimodal speaker detection. IEEE Transactions on Multimedia, 10(1):63–73, 2008. [ bib | DOI ]
[589] O. Koval, S. Voloshynovskiy, F. Caire, and P. Bas. Privacy-preserving multimodal person and object identification. In MM&Sec 2008, 2008. [ bib ]
[590] G. Bologna, B. Deville, M. Vinckenbosch, and T. Pun. A perceptual interface for vision substitution in a color matching experiment. In Proceeding on IEEE IJCNN, IEEE World congress on computational intelligence, 2008. [ bib ]
[591] J. Meynet, T. Arsan, J. Cruz Mota, and J. Ph. Thiran. Fast multi-view face tracking with pose estimation. In 16th European Signal Processing Conference, 2008. [ bib | http | Abstract ]
[592] F. Valente and H. Hermansky. On the combination of auditory and modulation frequency channels for asr applications. In Interspeech 2008, 2008. IDIAP-RR 08-12. [ bib | Abstract ]
[593] M. Gurban, J. Ph. Thiran, T. Drugman, and T. Dutoit. Dynamic modality weighting for multi-stream hmms in audio-visual speech recognition. In 10th International Conference on Multimodal Interfaces, 2008. [ bib | http | Abstract ]
[594] M. Gurban and J. Ph. Thiran. Using entropy as a stream reliability estimate for audio-visual speech recognition. In 16th European Signal Processing Conference, Lausanne, Switzerland, 2008. [ bib | http | Abstract ]
[595] P. Motlicek, S. Ganapathy, H. Hermansky, H. Garudadri, and M. Athineos. Perceptually motivated sub-band decomposition for fdlp audio coding. In Text, Speech and Dialogue, volume 5246 of Series of Lecture Notes in Artificial Intelligence (LNAI), pages 435–442. Springer-Verlag Berlin, Heidelberg, 2008. [ bib | .pdf | Abstract ]
[596] D. Roth, E. Koller-Meier, D. Rowe, T. B. Moeslund, and L. van Gool. Event-based tracking evaluation metric. In IEEE Workshop on Motion and Video Computing (WMVC), 2008. [ bib ]
[597] K. Schindler and L. van Gool. Combining densely sampled form and motion for human action recognition. In DAGM Annual Pattern Recognition Symposium. Springer, 2008. [ bib ]
[598] S. Kosinov and T. Pun. Distance-based discriminant analysis method and its applications. Pattern Analysis and Applications, 11(3-4):227–246, 2008. (DOI: 10.1007/s10044-007-0082-x). [ bib | .pdf ]
[599] M. D. Breitenstein, D. Kuettel, T. Weise, L. van Gool, and H. Pfister. Real-time face pose estimation from single range images. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR'08). IEEE Press, 2008. [ bib ]
[600] P. W. Ferrez and J. del R. Millán. Error-related eeg potentials generated during simulated brain-computer interaction. IEEE Transactions on Biomedical Engineering, 55(3):923–929, 2008. [ bib | DOI | Abstract ]
[601] O. Koval, S. Voloshynovskiy, and T. Pun. Privacy-preserving multimodal person and object identification. In Proceedings of the 10th ACM Workshop on Multimedia & Security, Oxford, UK, 2008. [ bib ]
[602] W. Li, M. M. Doss, J. Dines, and H. Bourlard. Mlp-based log spectral energy mapping for robust overlapping speech recognition. In European Signal Processing Conference, 2008. [ bib ]
[603] W. Li. Effective post-processing of single-channel frequency-domain speech enhancement. In IEEE conference on multimedia and expo, 2008. [ bib ]
[604] W. Li, J. Dines, M. Magimai-Doss, and H. Bourlard. Neural network based regression for robust overlapping speech recognition using microphone arrays. In Interspeech, 2008. [ bib ]
[605] H. Hung and D. Gatica-Perez. Identifying dominant people in meetings from audio-visual sensors. In Proc. IEEE Int. Conf. on Automatic Face and Gesture Recognition, Special Session on Multimodal HCI for Smart Environments, Amsterdam, The Netherlands, 2008. [ bib | Abstract ]
[606] A. Humm. Modelling combined handwriting and speech modalities for user authentication. PhD thesis, University of Fribourg, Switzerland, 2008. [ bib ]
[607] S. Kosinov, E. Bruno, and S. Marchand-Maillet. Spatially-consistent partial matching for intra- and inter-image prototype selection. To appear in Signal Processing: Image Communication special issue on "Semantic Analysis for Interactive Multimedia Services", 2008. [ bib ]
[608] J. Anemuller, J. H. Back, B. Caputo, J. Luo, F. Ohl, F. Orabona, R. Vogels, D. Weinshall, and A. Zweig. Biologically motivated audio-visual cue integration for object. In Proceedings of the first International Conference on Cognitive Systems, 2008. [ bib | .pdf | Abstract ]
[609] D. Jayagopi, H. Hung, C. Yeo, and D. Gatica-Perez. Modeling dominance in group conversations from nonverbal activity cues. IEEE Trans. on Audio, Speech and Language Processing, Special Issue on Multimodal Processing for Speech-based Interactions, accepted for publication, 2008. [ bib ]
[610] H. Ketabdar and H. Bourlard. In-context phone posteriors as complementary features for tandem asr. In ICSLP'08, 2008. [ bib | Abstract ]
[611] B. Schouten, N. Juul, A. Drygajlo, and M. Tistarelli. Biometrics and identity management. Heidelberg, 2008. Springer. [ bib ]
[612] K. Kryszczuk and A. Drygajlo. Impact of feature correlations on separation between bivariate normal distributions. Tampa, Florida, USA, 2008. [ bib ]
[613] M. Ouaret, F. Dufaux, and T. Ebrahimi. Enabling privacy for distributed video coding by transform domain scrambling. In 2008 SPIE Visual Communications and Image Processing, 2008. [ bib | http | Abstract ]
[614] F. Beekhof, S. Voloshynovskiy, O. Koval, and R. Villán. Secure surface identification codes. In E. J. Delp III, P. W. Wong, J. Dittmann, and N. D. Memon, editors, Steganography, and Watermarking of Multimedia Contents X, volume 6819 of Proceedings of SPIE, (SPIE, Bellingham, WA 2008) 68190D, 2008. [ bib | DOI ]
[615] A. Stolcke, X. Anguera, K. Boakye, O. Cetin, A. Janin, M. Magimai-Doss, C. Wooters, and J. Zheng. The sri-icsi spring 2007 meeting and lecture recognition system. In Multimodal Technologies for Perception of Humans. Lecture Notes in Computer Science, 2008. [ bib ]
[616] P. N. Garner. Silence models in weighted finite-state transducers. In Interspeech, 2008. IDIAP-RR 08-19. [ bib | Abstract ]
[617] K. Boakye, O. Vinyals, and G. Friedland. Two's a crowd: improving speaker diarization by automatically identifying and excluding overlapped speech. In Interspeech, 2008. [ bib ]
[618] M. Liwicki, A. Schlapbach, and H. Bunke. Writer-dependent recognition of handwritten whiteboard notes in smart meeting room environments. In Proc. 8th IAPR Int. Workshop on Document Analysis Systems, pages 151–157, 2008. [ bib ]
[619] S. Ganapathy, P. Motlicek, H. Hermansky, and H. Garudadri. Spectral noise shaping: improvements in speech/audio codec based on linear prediction in spectral domain. In INTERSPEECH 2008, 2008. IDIAP-RR 08-16. [ bib | Abstract ]
[620] D. Jayagopi. Predicting the dominant clique in meetings through fusion of nonverbal cues. In Proc. ACM Vancouver, Canada, 2008. [ bib ]
[621] O. Koval, S. Voloshynovskiy, F. Beekhof, and T. Pun. Analysis of physical unclonable identification based on reference list decoding. In E. J. Delp III, P. W. Wong, J. Dittmann, and N. D. Memon, editors, Steganography, and Watermarking of Multimedia Contents X, volume 6819 of Proceedings of SPIE, (SPIE, Bellingham, WA 2008) 68190B, 2008. [ bib ]
[622] L. Perruchoud. The anterior cingulate cortex. Idiap-Com Idiap-Com-02-2008, IDIAP, April 2008. [ bib | .pdf ]
[623] B. Mesot. Inference in switching linear dynamical systems applied to noise robust speech recognition of isolated digits. PhD thesis, Ecole Polytechnique Fédérale de Lausanne, May 2008. Thèse Ecole polytechnique fédérale de Lausanne EPFL, no 4059 (2008), Faculté des sciences et techniques de l'ingénieur STI, Section de génie électrique et électronique, Institut de génie électrique et électronique IEL (Laboratoire de l'IDIAP LIDIAP). Dir.: Hervé Bourlard. [ bib | .pdf | Abstract ]
[624] F. Galán. Methods for Asynchronous and Non-Invasive EEG-Based Brain-Computer Interfaces. Towards Intelligent Brain-Actuated Wheelchairs. PhD thesis, University of Barcelona, June 2008. [ bib | .pdf ]
[625] S. H. K. Parthasarathi, P. Motlicek, and H. Hermansky. Exploiting temporal context for speech/non-speech detection. Idiap-RR Idiap-RR-21-2008, IDIAP, September 2008. [ bib | .ps.gz | .pdf | Abstract ]
[626] W. Li, K. Kumatani, J. Dines, M. Magimai-Doss, and H. Bourlard. A neural network based regression approach for recognizing simultaneous speech. Idiap-RR Idiap-RR-10-2008, IDIAP, September 2008. Submitted for publication. [ bib | .ps.gz | .pdf | Abstract ]
[627] S. Ganapathy, P. Motlicek, and H. Hermansky. Low-delay error resilient speech coding using sub-band hilbert envelopes. Idiap-RR Idiap-RR-75-2008, Idiap, September 2008. [ bib | .pdf ]
[628] R. A. Negoescu and D. Gatica-Perez. Topickr: Flickr groups and users reloaded. In MM '08: Proc. of the 16th ACM Intl. Conf. on Multimedia, Vancouver, Canada, October 2008. ACM. [ bib | Abstract ]
[629] Y. Grandvalet, A. Rakotomamonjy, J. Keshet, and S. Canu. Support vector machines with a reject option. In Proceedings of the 22nd Annual Conference on Neural Information Processing Systems, December 2008. [ bib | .pdf | Abstract ]
[630] P. Motlicek. Automatic out-of-language detection based on confidence measures derived from lvcsr word and phone lattices. In 10th Annual Conference of the International Speech Communication Association, 2009 ISCA. ISCA, 2009. [ bib | Abstract ]
[631] G. Heusch and S. Marcel. Bayesian networks to combine intensity and color information in face recognition. Idiap-RR Idiap-RR-27-2009, Idiap, 2009. [ bib | .pdf ]
[632] S. Ganapathy, P. Motlicek, and H. Hermansky. Error resilient speech coding using sub-band hilbert envelopes. In 12th International Conference on Text, Speech and Dialogue, TSD 2009, LNAI 5729. Springer-Verlag, Berlin Heidelberg 2009, 2009. [ bib | Abstract ]
[633] T. Weyand, T. Deselaers, and H. Ney. Log-linear mixtures for object recognition. In British Machine Vision Conference, 2009. [ bib ]
[634] T. Weise, T. Wismer, B. Leibe, and L. Van Gool. In-hand scanning with online loop closure. In IEEE International Workshop on 3-D Digital Imaging and Modeling, 2009. [ bib ]
[635] T. Gass, T. Deselaers, and H. Ney. Deformation-aware log-linear models. In Deutsche Arbeitsgemeinschaft für Mustererkennung Symposium, 2009. [ bib ]
[636] S. Pellegrini, A. Ess, K. Schindler, and L. van Gool. You'll never walk alone: Modeling social behavior for multi-target tracking. In International Conference on Computer Vision, 2009. [ bib ]
[637] S. Stalder, H. Grabner, and L. Van Gool. Beyond semi-supervised tracking: Tracking should be as simple as detection, but not simpler than recognition. In OLCV 09: 3rd On-line learning for Computer Vision Workshop, 2009. [ bib ]
[638] S. Gammeter, L. Bossard, T. Quack, and L. Van Gool. I know what you did last summer: object-level auto-annotation of holiday snaps. In International Conference on Computer Vision, 2009. [ bib ]
[639] Peter Gehler and Sebastian Nowozin. On feature combination for multiclass object classification. In Proceedings of the Twelfth IEEE International Conference on Computer Vision, 2009. [ bib ]
[640] Peter Gehler and Sebastian Nowozin. Let the kernel figure it out: Principled learning of pre-processing for kernel classifiers. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2009. [ bib ]
[641] N. Hasler, B. Rosenhahn, T. Thormählen, M. Wand, J. Gall, and H. P. Seidel. Markerless motion capture with unsynchronized moving cameras. In IEEE Conference on Computer Vision and Pattern Recognition, 2009. [ bib ]
[642] N. Bellotto, E. Sommerlade, B. Benfold, C. Bibby, I. Reid, D. Roth, L. Van Gool, C. Fernandez, and J. Gonzalez. A distributed camera system for multi-resolution surveillance. In Third ACM/IEEE International Conference on Distributed Smart Cameras, 2009. [ bib ]
[643] M. D. Breitenstein, Helmut Grabner, and Luc Van Gool. Hunting nessie – real-time abnormality detection from webcams. In IEEE International Workshop on Visual Surveillance, 2009. [ bib ]
[644] M. D. Breitenstein, Fabian Reichlin, B. Leibe, E. Koller-Meier, and Luc Van Gool. Robust tracking-by-detection using a detector confidence particle filter. In IEEE International Conference on Computer Vision, 2009. [ bib ]
[645] M. Van den Bergh, J. Halatsch, A. Kunze, F. Bosche, L. Van Gool, and G. Schmitt. Towards collaborative interaction with large nd models for effective project management. In 9th International Conference on Construction Applications of Virtual Reality, 2009. [ bib ]
[646] M. Van den Bergh, F. Bosche, E. Koller-Meier, and L. Van Gool. Haarlet-based hand gesture recognition for 3d interaction. In Proceedings of the IEEE Workshop on Applications of Computer Vision, 2009. [ bib ]
[647] M. Shaheen, J. Gall, R. Strzodka, L. Van Gool, and H. P. Seidel. A comparison of 3d model-based tracking approaches for human motion capture in uncontrolled environments. In IEEE Workshop on Applications of Computer Vision, 2009. [ bib ]
[648] L. Wu, S. C. Hoi, R. Jin, J. Zhu, and N. Yu. Distance metric learning from uncertain side information with application to automated photo tagging. In ACM Multimedia 2009, 2009. [ bib ]
[649] L. Jie, Barbara Caputo, and V. Ferrari. Who's doing what: Joint modeling of names and verbs for simultaneous face and pose annotation. In Advances in Neural Information Processing Systems, 2009. [ bib ]
[650] K. Dylla, P. Müller, A. Ulmer, S. Haegler, and B. Fischer. Rome reborn 2.0: A framework for virtual city reconstruction using procedural modeling techniques. In Proceedings of Computer Applications and Quantitative Methods in Archaeology, 2009. [ bib ]
[651] J. Zhu, L. Van Gool, and S. C. Hoi. Unsupervised face alignment by robust nonrigid mapping. In ICCV2009, 2009. [ bib ]
[652] J. Gall, C. Stoll, E. de Aguiar, C. Theobalt, B. Rosenhahn, and H. P. Seidel. Motion capture using joint skeleton tracking and surface estimation. In IEEE Conference on Computer Vision and Pattern Recognition, 2009. [ bib ]
[653] J. Gall and V. Lempitsky. Class-specific hough forests for object detection. In IEEE Conference on Computer Vision and Pattern Recognition, 2009. [ bib ]
[654] Henning Hamer, K. Schindler, E. Koller-Meier, and Luc Van Gool. Tracking a hand manipulating an object. In IEEE International Conference on Computer Vision, 2009. [ bib ]
[655] Gideon Aschwanden, S. Haegler, Jan Halatsch, Raphael Jecker, Gerhard Schmitt, and Luc Van Gool. Evaluation of 3d city models using automatic placed urban agents. In CONVR, 2009. [ bib ]
[656] G. Fanelli, J. Gall, and L. Van Gool. Hough transform-based mouth localization for audio-visual speech recognition. In British Machine Vision Conference, 2009. [ bib ]
[657] Fabian Nater, Helmut Grabner, T. Jaeggli, and Luc Van Gool. Tracker trees for unusual event detection. In IEEE International Workshop on Visual Surveillance, 2009. [ bib ]
[658] Marcin Eichner and V. Ferrari. Better appearance models for pictorial structures. In British Machine Vision Conference, 2009. [ bib ]
[659] Alain Lehmann, B. Leibe, and Luc Van Gool. Prism: Principled implicit shape model. In British Machine Vision Conference, 2009. [ bib ]
[660] Alain Lehmann, B. Leibe, and Luc Van Gool. Feature-centric efficient subwindow search. In IEEE International Conference on Computer Vision, 2009. [ bib ]
[661] A. Ess, K. Schindler, B. Leibe, and L. van Gool. Improved multi-person tracking with active occlusion handling. In ICRA Workshop on People Detection and Tracking, 2009. [ bib ]
[662] A. Ess, B. Leibe, K. Schindler, and L. Van Gool. Moving obstacle detection in highly dynamic scenes. In IEEE International Conference on Robotics and Automation, 2009. [ bib ]
[663] V. Ferrari, M. Marin, and A. Zisserman. 2d human pose estimation in tv shows. In D. Cremers, B. Rosenhahn, A. Yuille, and F. Schmidt, editors, Statistical and Geometrical Approaches to Visual Motion Analysis, pages 128–147. Springer, 2009. [ bib ]
[664] Peter Gehler and Bernhard Schölkopf. An introduction to kernel learning algorithms. In Gustavo Camps-Valls and Lorenzo Bruzzone, editors, Kernel Methods for Remote Sensing Data Analysis, pages 39–60. Wiley, 2009. [ bib ]
[665] Michael Van den Bergh, Roland Kehl, E. Koller-Meier, and Luc Van Gool. Real-time 3d body pose estimation. In Hamid Aghajan and Andrea Cavallaro, editors, Multi-Camera Networks: Concepts and Applications, pages 335–360. Elsevier, 2009. [ bib ]
[666] S. Haegler, P. Müller, and Luc Van Gool. Procedural modeling for digital cultural heritage. EURASIP Journal on Image and Video Processing, 2009, 2009. [ bib ]
[667] Michael Van den Bergh, E. Koller-Meier, and Luc Van Gool. Real-time body pose recognition using 2d or 3d haarlets. International Journal of Computer Vision, 83:72–84, 2009. [ bib ]
[668] I. K. Park, M. Germann, M. D. Breitenstein, and H. Pfister. Fast and automatic object pose estimation for range images on the gpu. Machine Vision and Applications, 2009. [ bib ]
[669] F. Bosché, C. T. Haas, and B. Akinci. Automated recognition of 3d cad objects in site laser scans for project 3d status visualization and performance control. ASCE Journal of Computing in Civil Engineering, 23(6):311–318, 2009. [ bib ]
[670] D. Roth, E. Koller-Meier, and Luc Van Gool. Multi-object tracking evaluated on sparse events. Multimedia Tools and Applications, 2009. [ bib ]
[671] A. Thomas, V. Ferrari, B. Leibe, T. Tuytelaars, and L. Van Gool. Using multi-view recognition to guide a robot. International Journal of Robotics Research, 28(8):976–998, 2009. [ bib ]
[672] A. Thomas, V. Ferrari, B. Leibe, T. Tuytelaars, and L. Van Gool. Shape-from-recognition: Recognition enables meta-data transfer. Computer Vision and Image Understanding, 113(12):1222–1234, 2009. [ bib ]
[673] A. Ess, B. Leibe, K. Schindler, and L. Van Gool. Robust multi-person tracking from a mobile platform. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(10):1831–1846, 2009. [ bib ]
[674] Virginia Estellers, M. Gurban, and J. Ph. Thiran. Selecting relevant visual features for speechreading. In Proc. of the IEEE International Conference on Image Processing, Cairo, Egypt, 2009. [ bib | http | http | Abstract ]
[675] L. Gui, J. Ph. Thiran, and N. Paragios. Cooperative Object Segmentation and Behavior Inference in Image Sequences. International Journal of Computer Vision, 84(2):146–162, 2009. [ bib | http | Abstract ]
[676] J. Kierkels, M. Soleymani, and T. Pun. Queries and tags in affect-based multimedia retrieval. In International Conference on Multimedia and Expo, Special Session on Implicit Tagging, 2009. [ bib ]
[677] J. Kierkels and T. Pun. Simultaneous exploitation of explicit and implicit tags in affect-based multimedia retrieval. In International Conference on Affective Computing and Intelligent Interaction, pages 274–279, 2009. [ bib ]
[678] M. Soleymani, J. Kierkels, G. Chanel, and T. Pun. A bayesian framework for video affective representation. In International Conference on Affective Computing and Intelligent Interaction, pages 267–273, 2009. [ bib ]
[679] M. Soleymani, J. Davis, and T. Pun. A collaborative personalized affective video retrieval system. In International Conference on Affective Computing and Intelligent Interaction, pages 588–589, 2009. [ bib ]
[680] J. Kierkels, M. Soleymani, and T. Pun. Identification of narrative peaks in clips: text features perform best. In VideoCLEF 2009, Cross Language Evaluation Forum (CLEF) Workshop, ECDL 200, 2009. [ bib ]
[681] B. Deville, G. Bologna, M. Vinckenbosch, and T. Pun. See color: Seeing colours with an orchestra. In D. Lalanne and J. Kohlas, editors, Human Machine Interaction, Research Results of the MMI Program, pages 251–279. Springer LNCS, 2009. [ bib ]
[682] B. Dumas, D. Lalanne, and R. Ingold. Hephaistk: A toolkit for rapid prototyping of multimodal interfaces. In Proceedings of International Conference on Multimodal Interfaces and Workshop on Machine Learning for Multi-modal Interaction (ICMI-MLMI 2009), pages 231–232, 2009. [ bib ]
[683] B. Dumas, D. Lalanne, and R. Ingold. Description languages for multimodal interaction: a set of guidelines. Journal on Multimodal User Interfaces, 3, 2009. [ bib ]
[684] F. Evéquoz. An ethnographically-inspired survey of pim strategies. Technical report, Department of Informatics, University of Fribourg, Switzerland, 2009. [ bib ]
[685] F. Evéquoz and D. Lalanne. "i thought you would show me how to do it" – studying and supporting pim strategy changes. In Proceedings of ASIS&T PIM Workshop (ASIS&T 2009), 2009. [ bib ]
[686] E. Bertini and D. Lalanne. Investigating and reflecting on the integration of automatic data analysis and visualization in knowledge discovery. ACM SIGKDD Explorations, 22, 2009. [ bib ]
[687] Pascal Bruegger, D. Lalanne, A. Lisowska, and B. Hirsbrunner. A method and tools for designing and prototyping activity-based pervasive applications. In Proceedings of 7th International Conference on Advances in Mobile Computing & Multimedia (ACM MoMM 2009), pages 129–136, 2009. [ bib ]
[688] Dalila Mekhaldi and D. Lalanne. Joining meeting documents to strengthen multimodal thematic alignment. In Proceedings of 5th International Conference on Signal Image Technology and Internet Based Systems (SITIS 2009), pages 88–96, 2009. [ bib ]
[689] F. De Simone, L. Goldmann, V. Baroncini, and T. Ebrahimi. Subjective evaluation of JPEG XR image compression. In Proceedings of SPIE, volume 7443, San Diego, California, USA, 2009. [ bib ]
[690] J. S. Lee and T. Ebrahimi. Efficient video coding in H.264/AVC by using audio-visual information. In Proceedings of the IEEE International Workshop on Multimedia Signal Processing, Rio de Janeiro, Brazil, 2009. [ bib ]
[691] A. Yazdani, J. S. Lee, and T. Ebrahimi. Implicit emotional tagging of multimedia using EEG signals and brain computer interface. In Proceedings of the International Workshop on Social Media, pages 81–88, Beijing, China, 2009. [ bib ]
[692] P. Vajda, L. Goldmann, and T. Ebrahimi. Analysis of the limits of graph-based object duplicate detection. In Proceedings of the IEEE International Symposium on Multimedia, pages 600–605, San Diego, California, USA, 2009. [ bib ]
[693] F. Kaplan, S. Do-Lenh, K. Bachour, G. Y. Kao, C. Gault, and P. Dillenbourg. Interpersonal Computers for Higher Education. In P. Dillenbourg, J. Huang, and M. Cherubini, editors, Interactive Artifacts and Furniture Supporting Collaborative Work and Learning, Computer-Supported Collaborative Learning Series, pages 129–145. Springer US, 2009. [ bib | http ]
[694] Marc-Antoine Nuessli, Patrick Jermann, Mirweis Sangin, and Pierre Dillenbourg. Collaboration and abstract representations: towards predictive models based on raw speech and eye-tracking data. In CSCL '09: Proceedings of the 2009 conference on Computer support for collaborative learning. International Society of the Learning Sciences, 2009. Invited Paper. [ bib | http | http | Abstract ]
[695] Guillaume Zufferey, Patrick Jermann, Son Do Lenh, and Pierre Dillenbourg. Using Augmentations as Bridges from Concrete to Abstract Representations. In Proceedings of the 23rd British HCI Group Annual Conference on HCI 2009: Celebrating People and Technology, pages 130–139, Swinton, 2009. British Computer Society. [ bib | http | http | Abstract ]
[696] Alexander Sproewitz, A. Billard, Pierre Dillenbourg, and Auke Jan Ijspeert. Roombots-Mechanical Design of Self-Reconfiguring Modular Robots for Adaptive Furniture. In Proceedings of 2009 IEEE International Conference on Robotics and Automation, pages 4259–4264, 2009. [ bib | DOI | .pdf | Abstract ]
[697] P. Estrella, A. Popescu-Belis, and M. King. The femti guidelines for contextual mt evaluation: principles and tools. Linguistica Antverpiensia New Series, 8, 2009. [ bib ]
[698] S. Favre. Social network analysis in multimedia indexing: Making sense of people in multiparty recordings. In Proceedings of the Doctoral Consortium of the International Conference on Affective Computing & Intelligent Interaction (ACII), pages 25–32, 2009. [ bib | .pdf | Abstract ]
[699] J. Galbally, C. McCool, J. Fierrez, S. Marcel, and J. Ortega-Garcia. On the vulnerability of face verification systems to hill-climbing attacks. Pattern Recognition, 2009. [ bib | Abstract ]
[700] J. Ph. Thiran, H. Bourlard, and F. Marques. Multimodal Signal Processing: Methods and Techniques to Build Multimodal Interactive Systems. Academic Press, 2009. [ bib | Abstract ]
[701] D. Gatica-Perez. Modeling interest in face-to-face conversations from multimodal nonverbal behavior. In J.-P. Thiran, H. Bourlard, and F. Marques, (Eds.), Multimodal Signal Processing. Academic Press, 2009. [ bib | .pdf ]
[702] A. Popescu-Belis. Managing multimodal data, metadata and annotations: Challenges and solutions. In J. Ph. Thiran, F. Marques, and H. Bourlard, editors, Multimodal Signal Processing for Human-Computer Interaction, pages 183–203. Elsevier / Academic Press, London, 2009. [ bib ]
[703] G. Friedland, H. Hung, and Chuohao Yeo. Multi-modal speaker diarization of real-world meetings using compressed-domain video features. In International Conference on Audio, Speech and Signal Processing, 2009. [ bib | .pdf | Abstract ]
[704] K. Farrahi and D. Gatica-Perez. Learning and predicting multimodal daily life patterns from cell phones. In ICMI-MLMI, 2009. [ bib | .pdf | Abstract ]
[705] J. Berclaz, A. Shahrokni, F. Fleuret, James Ferryman, and P. Fua. Evaluation of probabilistic occupancy map people detection for surveillance systems. In Proceedings of the IEEE International Workshop on Performance Evaluation of Tracking and Surveillance, 2009. [ bib | Abstract ]
[706] G. Heusch and S. Marcel. Bayesian networks to combine intensity and color information in face recognition. In International Conference on Biometrics [631], pages 414–423. [ bib | .pdf ]
[707] G. Heusch. Bayesian Networks as Generative Models for Face Recognition. PhD thesis, EPFL, 2009. [ bib | .pdf ]
[708] G. Heusch and S. Marcel. A novel statistical generative model dedicated to face recognition. In Image & Vision Computing [94]. in press. [ bib | .pdf ]
[709] E. Indermühle, M. Liwicki, and H. Bunke. Combining alignment results for historical handwritten document analysis. In Proc. 10th Int. Conf. on Document Analysis and Recognition, volume 3, pages 1186–1190, 2009. [ bib ]
[710] G. Gonzalez, F. Aguet, F. Fleuret, M. Unser, and P. Fua. Steerable features for statistical 3d dendrite detection. In Proceedings of the International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), 2009. (to appear). [ bib ]
[711] D. Imseng and G. Friedland. Robust speaker diarization for short speech recordings. In Proceedings of the IEEE workshop on Automatic Speech Recognition and Understanding, 2009. [ bib | .pdf | Abstract ]
[712] T. Tommasi and B. Caputo. The more you know, the less you learn: from knowledge transfer to one-shot learning of object categories. In BMVC, 2009. [ bib | .pdf | Abstract ]
[713] Q. A. Le and A. Popescu-Belis. Automatic vs. human question answering over multimedia meeting recordings. In Interspeech 2009 (10th Annual Conference of the International Speech Communication Association), Brighton, UK, 2009. [ bib ]
[714] S. Ba and J. M. Odobez. Recognizing human visual focus of attention from head pose in meetings. IEEE Trans. on System, Man and Cybernetics: part B, Man, 39(1):16–34, 2009. [ bib ]
[715] J. Keshet and D. Chazan. A kernel wrapper for phoneme sequence recognition. In J. Keshet and S. Bengio, editors, Automatic Speech and Speaker Recognition: Large Margin and Kernel Methods. John Wiley and Sons, 2009. [ bib | Abstract ]
[716] S. Xie, B. Favre, D. Hakkani-Tur, and Y. Liu. Leveraging sentence weights in a concept-based optimization framework for extractive meeting summarization. In 10th International Conference of the International Speech Communication Association, Brighton, UK, 2009. [ bib ]
[717] A. Graves, M. Liwicki, S. Fernandez, R. Bertolami, H. Bunke, and J. Schmidhuber. A novel connectionist system for unconstrained handwriting recognition. IEEE Trans. PAMI, 31(5):855–869, 2009. [ bib ]
[718] J. Luo, F. Orabona, and B. Caputo. An online framework for learning novel concepts over multiple cues. In Proceeding of The 9th Asian Conference on Computer Vision, 2009. [ bib | .pdf | Abstract ]
[719] A. Vinciarelli, M. Pantic, and H. Bourlard. Social signal processing: Survey of an emerging domain. Image and Vision Computing, 2009. to appear. [ bib | .pdf | Abstract ]
[720] G. Bologna, B. Deville, and T. Pun. On the use of the auditory pathway to represent image scenes in real-time. Neurocomputing, 72:839–849, 2009. [ bib ]
[721] D. Vijayasenan, F. Valente, and H. Bourlard. An information theoretic approach to speaker diarization of meeting data. IEEE Transactions on Audio Speech and Language Processing, 17(7):1382–1393, 2009. [ bib | DOI | .pdf | Abstract ]
[722] A. Popescu-Belis. Comparing meeting browsers using a task-based evaluation method. Idiap-RR Idiap-RR-11-2009, Idiap, 2009. [ bib | .pdf | Abstract ]
[723] M. Wuthrich, M. Liwicki, A. Fischer, E. Indermühle, H. Bunke, G. Viehhauser, and M. Stolz. Language model integration for the recognition of handwritten medieval documents. In Proc. 10th Int. Conf. on Document Analysis and Recognition, volume 1, pages 211–215, 2009. [ bib ]
[724] D. Morrison, E. Bruno, and S. Marchand-Maillet. Capturing the semantics of user interaction: a review and case study. In Emergent Web Intelligence. Springer, 2009. [ bib ]
[725] L. Gottlieb and G. Friedland. On the use of artificial conversation data for speaker recognition in cars. In IEEE International Conference for Semantic Computing, Berkeley, USA, 2009. [ bib ]
[726] D. Imseng. Novel initialization methods for speaker diarization. Idiap-RR Idiap-RR-07-2009, Idiap, 2009. Master's thesis. [ bib | .pdf | Abstract ]
[727] S. Favre, A. Dielmann, and A. Vinciarelli. Automatic role recognition in multiparty recordings using social networks and probabilistic sequential models. In ACM International Conference on Multimedia, To Appear, 2009. [ bib | .pdf | Abstract ]
[728] G. Bologna, B. Deville, and T. Pun. Blind navigation along a sinuous path by means of the see color interface. In IWINAC2009, 3rd International Work-conference on the Interplay between Natural and Artificial Computation, Santiago de Compostela, Spain, June 22–27, 2009. [ bib ]
[729] D. Hakkani-Tur. Towards automatic argument diagramming of multiparty meetings. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Taipei, Taiwan, 2009. [ bib ]
[730] F. Orabona, C. Castellini, B. Caputo, A. E. Fiorilla, and G. Sandini. Model adaptation with least-square svm for adaptive hand prosthetics. In IEEE International conference on Robotics and Automation, 2009. [ bib | .pdf ]
[731] P. N. Garner, J. Dines, T. Hain, A. El Hannani, M. Karafiat, D. Korchagin, M. Lincoln, V. Wan, and L. Zhang. Real-time asr from meetings. In Proceedings of Interspeech, 2009. [ bib | .pdf | Abstract ]
[732] X. Perrin, F. Colas, C. Pradalier, and R. Siegwart. Learning human habits and reactions to external events with a dynamic bayesian network. Technical report, Autonomous Systems Lab, ETHZ, 2009. [ bib ]
[733] G. Friedland, O. Vinyals, Y. Huang, and C. Muller. Prosodic and other long-term features for speaker diarization. IEEE Transactions on Audio, Speech and Language Processing, 17(5):985–993, 2009. [ bib ]
[734] D. Morrison, S. Marchand-Maillet, and E. Bruno. Modelling long-term relevance feedback. In Proceedings of the ECIR Workshop on Information Retrieval over Social Networks, Toulouse, FR, 2009. [ bib | .pdf ]
[735] B. Deville, G. Bologna, M. Vinckenbosch, and T. Pun. See color: seeing colours with an orchestra. In D. Lalanne and J. Kohlas, editors, Human Machine Interaction: Research Results of the MMI Program, volume 5440 of Lecture Notes in Computer Science, pages 251–279. Springer, 2009. Subseries: Programming and Software Engineering. [ bib ]
[736] M. M. Ullah, F. Orabona, and B. Caputo. You live, you learn, you forget: continuous learning of visual places with a forgetting mechanism. In International Conference on Robotic and Systems, 2009. [ bib ]
[737] F. Monay, P. Quelhas, J. M. Odobez, and D. Gatica-Perez. Contextual classification of image patches with latent aspect models. EURASIP Journal on Image and Video Processing, Special Issue on Patches in Vision, 2009. to appear. [ bib | .pdf | Abstract ]
[738] M. Magimai-Doss, G. Aradilla, and H. Bourlard. On joint modelling of grapheme and phoneme information using kl-hmm for asr. Idiap-RR Idiap-RR-24-2009, Idiap, 2009. [ bib | .pdf | Abstract ]
[739] F. Orabona, C. Castellini, B. Caputo, J. Luo, and G. Sandini. Towards life-long learning for cognitive systems: Online independent support vector machine. Pattern Recognition, Accepted for Pub, 2009. [ bib ]
[740] A. Vinciarelli. Capturing order in social interactions. IEEE Signal Processing Magazine, 2009. [ bib | .pdf | Abstract ]
[741] G. Friedland, H. Hung, and C. Yeo. Multi-modal speaker diarization of real-world meetings using compressed-domain video features. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Taipei, Taiwan, pages 4069–4072, 2009. [ bib ]
[742] G. Friedland, O. Vinyals, Y. Huang, and C. Muller. Fusion of short-term and long-term features for improved speaker diarization. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Taipei, Taiwan, pages 4077–4080, 2009. [ bib ]
[743] G. Friedland and D. van Leeuwen. Speaker diarization and identification. IEEE Press/Wiley, 2009. [ bib ]
[744] G. Friedland, C. Yeo, and H. Hung. Visual speaker localization aided by acoustic models (full paper). In Proceedings of ACM Multimedia, Beijing, China, 2009. [ bib ]
[745] F. Orabona, B. Caputo, A. Fillbrandt, and F. Ohl. A theoretical framework for transfer of knowledge across modalities in artificial and cognitive systems. In International Conference on Developmental Learning, 2009. [ bib | .pdf ]
[746] D. Gillick, K. Riedhammer, B. Favre, and D. Hakkani-Tur. A global optimization framework for meeting summarization. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Taipei, Taiwan, 2009. [ bib ]
[747] A. Vinciarelli, N. Suditu, and M. Pantic. Implicit human centered tagging. In Proceedings of IEEE Conference on Multimedia and Expo, pages 1428–1431, 2009. [ bib | .pdf | Abstract ]
[748] D. Lalanne and J. Kohlas. Human machine interaction. 2009. [ bib ]
[749] S. Voloshynovskiy, O. Koval, F. Beekhof, and T. Holotyak. Binary robust hashing based on probabilistic bit reliability. In IEEE Workshop on Statistical Signal Processing 2009, 2009. [ bib ]
[750] V. Frinken and H. Bunke. Evaluating retraining rules for semi-supervised learning in neural network based cursive word recognition. In Proc. 10th Int. Conf. on Document Analysis and Recognition, volume 1, pages 31–35, 2009. [ bib ]
[751] A. Pronobis and B. Caputo. Cold: The cosy localization database. International Journal of Robotics Research, 28(5):588–594, 2009. [ bib | .pdf ]
[752] G. Chanel, J. Kierkels, M. Soleymani, and T. Pun. Short-term emotion assessment in a recall paradigm. International Journal of Human-Computer Studies, 67(8):607–627, 2009. DOI: http://dx.doi.org/10.1016/j.ijhcs.2009.03.005. [ bib | http ]
[753] O. Koval, S. Voloshynovskiy, F. Caire, and P. Bas. On security threats for robust perceptual hashing. In Electronic Imaging 2009, 2009. [ bib ]
[754] S. H. K. Parthasarathi, M. Magimai-Doss, D. Gatica-Perez, and H. Bourlard. Speaker change detection with privacy-preserving audio cues. In Proceedings of ICMI-MLMI 2009, 2009. [ bib | .pdf | Abstract ]
[755] J. Ortega-Garcia, J. Fierrez, F. Alonso-Fernandez, J. Galbally, M. R. Freire, J. Gonzalez-Rodriguez, C. Garcia-Mateo, J. L. Alba-Castro, E. Gonzalez-Agulla, E. Otero-Muras, S. Garcia-Salicetti, L. Allano, B. Ly-Van, B. Dorizzi, J. Kittler, T. Bourlai, N. Poh, F. Deravi, M. W. R. Ng, M. Fairhurst, J. Hennebert, A. Humm, M. Tistarelli, L. Brodo, J. Richiardi, A. Drygajlo, H. Ganster, F. M. Sukno, S. K. Pavani, A. Frangi, L. Akarun, and A. Savran. The multi-scenario multi-environment biosecure multimodal database (bmdb). IEEE Trans. on Pattern Analysis and Machine Intelligence, 2009. to appear. [ bib ]
[756] K. Kryszczuk and A. Drygajlo. Improving biometric verification with class-independent quality information. IET Signal Processing, Special Issue on Biometric Recognition, 3(4):310–321, 2009. [ bib ]
[757] P. N. Garner. Snr features for automatic speech recognition. In Proceedings of the IEEE workshop on Automatic Speech Recognition and Understanding [848]. [ bib | .pdf | Abstract ]
[758] S. Ganapathy, P. Motlicek, and H. Hermansky. Error resilient speech coding using sub-band hilbert envelopes. In 12th International Conference on Text, Speech and Dialogue, TSD 2009 [632], pages 355–362. [ bib | .pdf | Abstract ]
[759] J. Baker, L. Deng, J. Glass, S. Khudanpur, C. H. Lee, N. Morgan, and D. O'Shaughnessy. Research developments and directions in speech recognition and understanding. IEEE Signal Processing Magazine, 26(3):75–80, 2009. [ bib ]
[760] G. Garau, S. Ba, H. Bourlard, and J. M. Odobez. Investigating the use of visual focus of attention for audio-visual speaker diarisation. In Proceedings of the ACM International Conference on Multimedia, 2009. [ bib | .pdf | Abstract ]
[761] J. Richiardi, K. Kryszczuk, and A. Drygajlo. Static models of derivative-coordinates phase spaces for multivariate time series classification: an application to signature verification. In Advances in Biometrics, Lecture Notes in Computer Science 5558, pages 1200–1208, Heidelberg, 2009. [ bib ]
[762] A. Humm, R. Ingold, and J. Hennebert. Spoken handwriting for user authentication using joint modelling systems. In Proceedings of 6th International Symposium on Image and Signal Processing and Analysis (ISPA'09), Salzburg (Austria), 2009. [ bib ]
[763] A. Humm, J. Hennebert, and R. Ingold. Combined handwriting and speech modalities for user authentication. IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems and Humans, 39, 2009. [ bib ]
[764] V. Frinken, K. Riesen, and H. Bunke. Improving graph classification by isomap. In A. Torsello, F. Escolano, and L. Brun, editors, Graph-Based Representations in Pattern Recognition, LNCS 5534, pages 205–214. Springer, 2009. [ bib ]
[765] D. Gelbart, N. Morgan, and A. Tsymbal. Hill-climbing feature selection for multi-stream asr. In 10th International Conference of the International Speech Communication Association, Brighton, UK, 2009. [ bib ]
[766] H. Hung and S. Ba. Speech/non-speech detection in meetings from automatically extracted low resolution visual features. Idiap-RR Idiap-RR-20-2009, Idiap, 2009. submitted to icmi-mlmi. [ bib | .pdf | Abstract ]
[767] S. Y. Zhao, R. Ravuri, and N. Morgan. Multi-stream to many-stream: using spectro-temporal features for asr. In 10th International Conference of the International Speech Communication Association, Brighton, UK, 2009. [ bib ]
[768] A. Drygajlo, W. Li, and K. Zhu. Q-stack aging model for face verification. In 17th European Signal Processing Conference, Glasgow, UK, 2009. [ bib ]
[769] M. Soleymani, G. Chanel, J. Kierkels, and T. Pun. Affective characterization of movie scenes based on content analysis and physiological changes. International Journal of Semantic Computing, 2009. (to appear). [ bib ]
[770] J. P. Pinto, G. S. V. S. Sivaram, H. Hermansky, and M. Magimai-Doss. Volterra series for analyzing mlp based phoneme posterior probability estimator. In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2009. [ bib | .pdf | Abstract ]
[771] A. Popescu-Belis and A. Vinciarelli. Multimedia meeting processing and retrieval at the idiap research institute. Informer (Newsletter of the BCS Information Retrieval Specialist Group), 29:14–16, 2009. [ bib ]
[772] D. Gatica-Perez. Automatic nonverbal analysis of social interaction in small groups: a review. In Image and Vision Computing, Special Issue on Human Naturalistic Behavior, in press, 2009. [ bib ]
[773] J. Dines, L. Saheer, and H. Liang. Speech recognition with speech synthesis models by marginalising over decision tree leaves. In Proceedings of Interspeech, 2009. [ bib | .pdf | Abstract ]
[774] C. McCool and S. Marcel. Parts-based face verification using local frequency bands. In Proceedings of IEEE/IAPR International Conference on Biometrics, 2009. [ bib | .pdf ]
[775] S. H. K. Parthasarathi, M. Magimai-Doss, H. Bourlard, and D. Gatica-Perez. Investigating privacy-sensitive features for speech detection in multiparty conversations. In Proceedings of Interspeech 2009, 2009. [ bib | .pdf | Abstract ]
[776] J. Yao and J. M. Odobez. Fast human detection in videos using joint appearance and foreground learning from covariances of image feature subsets. Idiap-RR Idiap-RR-19-2009, Idiap, 2009. [ bib | .pdf | Abstract ]
[777] V. Frinken and H. Bunke. Self-training strategies for handwriting word recognition. In Proc. Industrial Conf. Advances in Data Mining. Applications and Theoretical Aspects, LNCS 5633, pages 291–300. Springer, 2009. [ bib ]
[778] F. De Simone, F. Dufaux, T. Ebrahimi, C. Delogu, and V. Baroncini. A subjective study of the influence of color information on visual quality assessment of high resolution pictures. In Fourth International Workshop on Video Processing and Quality Metrics for Consumer Electronics (VPQM-09), 2009. [ bib | http | Abstract ]
[779] J. Keshet, S. Shalev-Shwartz, Y. Singer, and D. Chazan. A large margin algorithm for forced alignment. In J. Keshet and S. Bengio, editors, Automatic Speech and Speaker Recognition: Large Margin and Kernel Methods. John Wiley and Sons, 2009. [ bib | Abstract ]
[780] G. Aradilla, H. Bourlard, and M. Magimai-Doss. Posterior features applied to speech recognition tasks with user-defined vocabulary. In Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2009. [ bib | .pdf ]
[781] D. Grangier, J. Keshet, and S. Bengio. Discriminative keyword spotting. In J. Keshet and S. Bengio, editors, Automatic Speech and Speaker Recognition: Large Margin and Kernel Methods. John Wiley and Sons, 2009. [ bib | Abstract ]
[782] B. Dumas, D. Lalanne, and R. Ingold. Benchmarking fusion engines of multimodal interactive systems. In Proceedings of International Conference on Multimodal Interfaces and Workshop on Machine Learning for Multi-modal Interaction (ICMI-MLMI 2009), Cambridge (MA) (USA), 2009. [ bib ]
[783] S. Thomas, S. Ganapathy, and H. Hermansky. Phoneme recognition using spectral envelope and modulation frequency features. Idiap-RR Idiap-RR-04-2009, Idiap, 2009. [ bib | .pdf | Abstract ]
[784] J. Keshet, D. Grangier, and S. Bengio. Discriminative keyword spotting. Speech Communication, 51(4):317–329, 2009. [ bib | .pdf ]
[785] X. Perrin, F. Colas, C. Pradalier, and R. Siegwart. Learning to identify users and predict their destination in a robotic guidance application. In Field and Service Robotics (FSR), Cambridge, MA, 2009. [ bib ]
[786] K. Kryszczuk and A. Drygajlo. Improving biometric verification with class-independent quality information. volume 3, pages 310–321, 2009. [ bib ]
[787] I. Ivanov, F. Dufaux, T. M. Ha, and T. Ebrahimi. Towards generic detection of unusual events in video surveillance. In 6th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS'09), 2009. [ bib | http | Abstract ]
[788] S. Voloshynovskiy, O. Koval, F. Beekhof, and T. Pun. Random projections based item authentication. In Electronic Imaging 2009, 2009. [ bib ]
[789] J. Galbally, C. McCool, J. Fierrez, S. Marcel, and J. Ortega-Garcia. Hill-climbing attack to an eigenface-based face verification system. In Proceedings of the First IEEE International Conference on Biometrics, Identity and Security (BIdS), 2009. [ bib | .pdf | Abstract ]
[790] S. Lefèvre and J. M. Odobez. Structure and appearance features for robust 3d facial actions tracking. In International Conference on Multimedia and Expo (ICME), 2009. [ bib ]
[791] K. Ali, F. Fleuret, D. Hasler, and P. Fua. Joint learning of pose estimators and features for object detection. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2009. (to appear). [ bib ]
[792] F. Fleuret. Multi-layer boosting for pattern recognition. Pattern Recognition Letters (PRL), 30:237–241, 2009. [ bib ]
[793] D. Vijayasenan, F. Valente, and H. Bourlard. Mutual information based channel selection for speaker diarization of meetings data. In Proceedings of International Conference on Acoustics, Speech and Signal Processing, 2009. [ bib | .pdf | Abstract ]
[794] E. Bruno and S. Marchand-Maillet. Multimodal preference aggregation for multimedia information retrieval. To appear in Journal of Multimedia, 2009. [ bib | .pdf ]
[795] B. Raducanu and D. Gatica-Perez. You are fired! nonverbal role analysis in competitive meetings. In Proc. ICASSP, Taiwan, 2009. [ bib ]
[796] J. L. Bloechle, D. Lalanne, and R. Ingold. Ocd: an optimized and canonical document format. In Proceedings of 10th IEEE International Conference on Document Analysis and Recognition (ICDAR 2009), pages 236–240, Barcelona (Spain), 2009. [ bib ]
[797] P. Motlicek. Automatic out-of-language detection based on confidence measures derived from lvcsr word and phone lattices. In 10th Annual Conference of the International Speech Communication Association [630], pages 1215–1218. [ bib | .pdf | Abstract ]
[798] D. Jayagopi and D. Gatica-Perez. Discovering group nonverbal conversational patterns with topics. In Proc. ICMI-MLMI, Boston, USA, 2009. Accepted for publication. [ bib ]
[799] J. Berclaz, F. Fleuret, and P. Fua. Multiple object tracking using flow linear programming. Technical Report 10-2009, IDIAP Research Institute, 2009. [ bib ]
[800] N. Garg, B. Favre, K. Riedhammer, and D. Hakkani-Tur. Clusterrank: a graph based method for meeting summarization. In 10th International Conference of the International Speech Communication Association, Brighton, UK, 2009. [ bib ]
[801] F. Beekhof, S. Voloshynovskiy, O. Koval, and T. Holotyak. Multi-class classifiers based on binary classifiers: performance, efficiency, and minimum coding matrix distances. In MLSP 2009, 2009. [ bib ]
[802] P. N. Garner. A map approach to noise compensation of speech. Idiap-RR Idiap-RR-08-2009, Idiap, 2009. [ bib | .pdf | Abstract ]
[803] A. Popescu-Belis, J. Carletta, J. Kilgour, and P. Poller. Accessing a large multimodal corpus using an automatic content linking device. In M. Kipp, J. C. Martin, P. Paggio, and D. Heylen, editors, Multimodal Corpora, LNAI. Springer-Verlag, Berlin/Heidelberg, 2009. [ bib ]
[804] E. Ricci and J. M. Odobez. Real-time simultaneous head tracking and pose estimation. In IEEE International Conference on Image Processing (ICIP), 2009. [ bib ]
[805] S. Duffner, J. M. Odobez, and E. Ricci. Dynamic partitioned sampling for tracking with discriminative features. In Proceedings of the British Machine Vision Conference, 2009. [ bib | .pdf | Abstract ]
[806] M. Wöllmer, F. Eyben, J. Keshet, A. Graves, B. Schuller, and G. Rigoll. Robust discriminative keyword spotting for emotionally colored spontaneous speech using bidirectional lstm networks. In IEEE International Conference on Acoustic, Speech, and Signal Processing, Taipei, Taiwan, 2009. [ bib | .pdf | Abstract ]
[807] G. Friedland, C. Yeo, and H. Hung. Visual speaker localization aided by acoustic models. In ACM Multimedia, 2009. [ bib | Abstract ]
[808] J. Keshet. A proposal for a kernel-based algorithm for large vocabulary continuous speech recognition. In J. Keshet and S. Bengio, editors, Automatic Speech and Speaker Recognition: Large Margin and Kernel Methods. John Wiley and Sons, 2009. [ bib | Abstract ]
[809] F. Valente, M. Magimai-Doss, C. Plahl, and R. Suman. Hierarchical processing of the modulation spectrum for gale mandarin lvcsr system. In Proceedings of the 10th Annual Conference of the International Speech Communication Association (Interspeech), 2009. [ bib | .pdf | Abstract ]
[810] G. Gonzalez, F. Fleuret, and P. Fua. Learning rotational features for filament detection. In Proceedings of the IEEE international conference on Computer Vision and Pattern Recognition (CVPR), 2009. (to appear). [ bib ]
[811] K. Kumatani, J. McDonough, B. Rauch, P. N. Garner, W. Li, and J. Dines. Maximum kurtosis beamforming with the generalized sidelobe canceller. In Proceedings of INTERSPEECH, September 2008, 2009. [ bib | .pdf | Abstract ]
[812] S. Marchand-Maillet, E. Szekely, and E. Bruno. Optimizing strategies for the exploration of social-networks and associated data collections. In Proceedings of the International Workshop on Image Analysis for Multimedia Interactive Services (WIAMIS'09) - Special session on "People, Pixels, Peers: Interactive Content in Social Networks", London, UK, 2009. (invited). [ bib | .pdf ]
[813] E. Bertini, D. Lalanne, and M. Rigamonti. Extended excentric labeling. International Journal of the Eurographics Association, 28, 2009. [ bib ]
[814] J. Baker, L. Deng, J. Glass, S. Khudanpur, C. H. Lee, N. Morgan, and D. O'Shaughnessy. Research developments and directions in speech recognition and understanding. IEEE Signal Processing Magazine, 26(4):78–85, 2009. [ bib ]
[815] N. Noceti, B. Caputo, C. Castellini, L. Baldassarre, A. Barla, L. Rosasco, F. Odone, and G. Sandini. Towards a theoretical framework for learning multi-modal patterns for embodied agents. In International Conference on Image Analysis and Processing, 2009. [ bib | .pdf ]
[816] A. Popescu-Belis, P. Poller, J. Kilgour, E. Boertjes, J. Carletta, S. Castronovo, M. Fapso, M. Flynn, A. Nanchen, T. Wilson, J. de Wit, and M. Yazdani. A multimedia retrieval system using speech input. In ICMI-MLMI 2009 (11th International Conference on Multimodal Interfaces and 6th Workshop on Machine Learning for Multimodal Interaction), Cambridge, MA, 2009. [ bib ]
[817] D. Jayagopi. Modeling dominance in group conversations using nonverbal activity cues. IEEE Trans. on Audio, Speech, and Language Processing, Special Issue on Multimodal Processing for Speech-based Interactions, 17:501–513, 2009. [ bib ]
[818] J. S. Lee, F. De Simone, and T. Ebrahimi. Video coding based on audio-visual attention. In IEEE International Conference on Multimedia and Expo (ICME'09), 2009. [ bib | http | Abstract ]
[819] P. Rajan, S. H. K. Parthasarathi, and H. Murthy. Robustness of phase based features for speaker recognition. In Proceedings of Interspeech [845]. [ bib | .pdf | Abstract ]
[820] M. Baechler, J. L. Bloechle, A. Humm, R. Ingold, and J. Hennebert. Labeled images verification using gaussian mixture models. In Proceedings of 24th Annual ACM Symposium on Applied Computing (ACM SAC'09), pages 1331–1336, Honolulu, Hawaii (USA), 2009. [ bib ]
[821] S. Ba, H. Hung, and J. M. Odobez. Visual activity context for focus of attention estimation in dynamic meetings. In IEEE Proc. Int. Conf. on Multimedia and Expo (ICME), New York, 2009. [ bib ]
[822] D. Lalanne, L. Nigay, P. Palanque, P. Robinson, J. Vanderdonckt, and J. F. Ladry. Fusion engines for multimodal interfaces: a survey. In Proceedings of International Conference on Multimodal Interfaces and Workshop on Machine Learning for Multi-modal Interaction (ICMI-MLMI 2009), Cambridge (MA) (USA), 2009. [ bib ]
[823] E. Bruno and S. Marchand-Maillet. Multiview clustering: a late fusion approach using latent models. In Proceedings of the 32nd ACM Special Interest Group on Information Retrieval Conference, SIGIR 09, Boston, USA, 2009. [ bib ]
[824] B. Picart. Improved phone posterior estimation through k-nn and mlp-based similarity. Idiap-RR Idiap-RR-18-2009, Idiap, Rue Marconi 19, 1920 Martigny - Switzerland, 2009. [ bib | .pdf | Abstract ]
[825] M. Gurban and J. Ph. Thiran. Information theoretic feature extraction for audio-visual speech recognition. IEEE Trans. on Signal Processing, in press, 2009. [ bib ]
[826] G. Bologna, S. Malandain, B. Deville, and T. Pun. The multi-touch see color interface. In ICTA 2009, The 2nd International Conference on Information and Communication Technologies and Accessibility, Hammamet, Tunisia, May 7–9, 2009. [ bib ]
[827] X. Perrin, R. Chavarriaga, C. Pradalier, J. del R. Millán, and R. Siegwart. Dialog management technique for brain-computer interfaces. Technical report, Autonomous Systems Lab, ETHZ, 2009. [ bib ]
[828] J. Yao and J. M. Odobez. Multi-camera multi-person 3d space tracking with mcmc in surveillance scenarios. In European Conference on Computer Vision, workshop on Multi Camera and Multi-modal Sensor Fusion Algorithms and Applications (ECCV-M2SFA2), 2009. [ bib | .pdf | Abstract ]
[829] B. Caputo, E. Hayman, M. Fritz, and J. O. Eklundh. Classifying material in the real world. Image and Vision Computing, accepted for pub, 2009. [ bib ]
[830] H. Salamin, S. Favre, and A. Vinciarelli. Automatic role recognition in multiparty recordings: Using social affiliation networks for feature extraction. IEEE Transactions on Multimedia, To Appear, 2009. [ bib | .pdf | Abstract ]
[831] N. Garg and D. Gatica-Perez. Tagging and retrieving images with co-occurrence models: from corel to flickr. Idiap-RR Idiap-RR-21-2009, Idiap, 2009. [ bib | .pdf | Abstract ]
[832] E. Bertini and D. Lalanne. Surveying the complementary roles of automatic data analysis and visualization in knowledge discovery. In Proceedings of ACM SIGKDD Workshop on Visual Analytics and Knowledge Discovery, VAKD '09, 15th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (VAKD 2009), pages 12–20, Paris (France), 2009. [ bib ]
[833] D. Jayagopi, R. Bogdan, and D. Gatica-Perez. Characterising conversational group dynamics using nonverbal behaviour. In Proceedings ICME 2009, 2009. [ bib | .pdf | Abstract ]
[834] K. Zhu, A. Drygajlo, and W. Li. Q-stack aging model for face verification. Glasgow, UK, 2009. [ bib ]
[835] N. Garg. Co-occurrence models for image annotation and retrieval. Idiap-RR Idiap-RR-22-2009, Idiap, 2009. Ecole Polytechnique Fédérale de Lausanne - Master Thesis. [ bib | .pdf | Abstract ]
[836] F. Orabona, J. Keshet, and B. Caputo. Bounded kernel-based perceptrons. Journal of Machine Learning Research, Accepted for pub, 2009. [ bib ]
[837] D. Vijayasenan, F. Valente, and H. Bourlard. Kl realignment for speaker diarization with multiple feature streams. In 10th Annual Conference of the International Speech Communication Association, 2009. [ bib | Abstract ]
[838] M. Pantic and A. Vinciarelli. Implicit human centered tagging. IEEE Signal Processing Magazine, 26, 2009. [ bib | .pdf ]
[839] F. Valente. A novel criterion for classifiers combination in multistream speech recognition. IEEE Signal Processing Letters, 16(7):561–564, 2009. [ bib | DOI | .pdf | Abstract ]
[840] J. Richiardi, A. Drygajlo, and K. Kryszczuk. Static models of derivative-coordinates phase spaces for multivariate time series classification: an application to signature verification. pages 140–149, Alghero, Italy, 2009. [ bib ]
[841] F. Orabona, C. Castellini, B. Caputo, A. E. Fiorilla, and G. Sandini. Model adaptation with least-squares svm for adaptive hand prosthetics. Idiap-RR Idiap-RR-05-2009, Idiap, March 2009. Accepted in ICRA09. [ bib | .pdf | Abstract ]
[842] Bogdan Raducanu, J. Vitria, and Daniel Gatica-Perez. You are fired! nonverbal role analysis in competitive meetings. In Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), Taiwan, April 2009. [ bib | .pdf | Abstract ]
[843] D. Vijayasenan, F. Valente, and H. Bourlard. Mutual information based channel selection for speaker diarization of meetings data. In Proceedings of International conference on acoustics speech and signal processing, April 2009. [ bib | Abstract ]
[844] W. Li, J. Dines, M. Magimai-Doss, and H. Bourlard. Non-linear mapping for multi-channel speech separation and robust overlapping speech recognition. In Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), April 2009. [ bib | .pdf | Abstract ]
[845] Padmanabhan Rajan, Sree Hari Krishnan Parthasarathi, and Hema A Murthy. Robustness of phase based features for speaker recognition. Idiap-RR Idiap-RR-14-2009, Idiap, June 2009. [ bib | .pdf | Abstract ]
[846] N. Scaringella. On the design of audio features robust to the album-effect for music information retrieval. PhD thesis, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland, June 2009. Thèse EPFL, no 4412 (2009). Dir.: Hervé Bourlard. [ bib | Abstract ]
[847] K. Kumatani, J. McDonough, Barbara Rauch, D. Klakow, P. N. Garner, and Weifeng Li. Beamforming with a maximum negentropy criterion. IEEE Transactions on Audio Speech and Language Processing, 17(5):994–1008, July 2009. [ bib | .pdf | Abstract ]
[848] Philip N. Garner. Snr features for automatic speech recognition. Idiap-RR Idiap-RR-25-2009, Idiap, September 2009. [ bib | .pdf | Abstract ]
[849] A. Roy and S. Marcel. Haar local binary pattern feature for fast illumination invariant face detection. Idiap-RR Idiap-RR-28-2009, Idiap, September 2009. [ bib | .pdf | Abstract ]
[850] A. Roy and S. Marcel. Haar local binary pattern feature for fast illumination invariant face detection. In British Machine Vision Conference 2009 [849]. [ bib | .pdf | Abstract ]
[851] A. Vinciarelli, A. Dielmann, S. Favre, and H. Salamin. Canal9: A database of political debates for analysis of social interactions. In Proceedings of the International Conference on Affective Computing and Intelligent Interaction (IEEE International Workshop on Social Signal Processing), pages 1–4, September 2009. Publication Date: 10-12 Sept. 2009. [ bib | DOI | .pdf | Abstract ]
[852] J. Dines, J. Yamagishi, and S. King. Measuring the gap between hmm-based asr and tts. In Proceedings of Interspeech, September 2009. [ bib | .pdf | Abstract ]
[853] P. Motlicek, S. Ganapathy, and H. Hermansky. Arithmetic coding of sub-band residuals in fdlp speech/audio codec. In 10th Annual Conference of the International Speech Communication Association, pages 2591–2594. ISCA, ISCA 2009, September 2009. [ bib | .pdf | Abstract ]
[854] Joan-Isaac Biel and D. Gatica-Perez. Wearing a youtube hat: directors, comedians, gurus, and user aggregated behavior. In Proceedings of the 17th ACM International Conference on Multimedia, pages 833–836. ACM, October 2009. [ bib | .pdf | Abstract ]
[855] Jagannadan Varadarajan and J. M. Odobez. Topic models for scene analysis and abnormality detection. In 9th International Workshop in Visual Surveillance. IEEE, October 2009. [ bib | .pdf ]
[856] Edgar Roman-Rangel, Carlos Pallan, J. M. Odobez, and D. Gatica-Perez. Retrieving ancient maya glyphs with shape context. In 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops. IEEE, October 2009. [ bib | .pdf | Abstract ]
[857] S. Ganapathy, P. Motlicek, and H. Hermansky. Mdct for encoding residual signals in frequency domain linear prediction. In Audio Engineering Society (AES), 127th Convention, number Preprint 7921 in 127th Convention, New York, USA, October 2009. Audio Engineering Society (AES), Audio Engineering Society, 60 East 42nd Street, New York, New York 10165-2520, USA. [ bib | http | .pdf | Abstract ]
[858] E. Ricci and J. M. Odobez. Learning large margin likelihood for realtime head pose tracking. In IEEE Int. Conference on Image Processing, Cairo, Egypt. IEEE, October 2009. [ bib | .pdf ]
[859] R. A. Negoescu, B. Adams, D. Phung, S. Venkatesh, and D. Gatica-Perez. Flickr hypergroups. In Proceedings of the 17th ACM International Conference on Multimedia, October 2009. [ bib | .pdf | Abstract ]
[860] S. Ganapathy, S. Thomas, P. Motlicek, and H. Hermansky. Applications of signal analysis using autoregressive models for amplitude modulation. In IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2009, WASPAA '09, pages 341–344. IEEE, October 2009. Digital Object Identifier 10.1109/ASPAA.2009.534649. [ bib | http | .pdf | Abstract ]
[861] D. Korchagin. Out-of-scene av data detection. Idiap-RR Idiap-RR-31-2009, Idiap, P.O. Box 592, CH-1920 Martigny, Switzerland, November 2009. [ bib | .pdf | Abstract ]
[862] Dairazalia Sanchez-Cortes, D. Jayagopi, and D. Gatica-Perez. Predicting remote versus collocated group interactions using nonverbal cues. In Proc. Int. Conf. on Multimodal Interfaces, Workshop on Multimodal Sensor-Based Systems and Mobile Phones for Social Computing, November 2009. [ bib | DOI | Abstract ]
[863] D. Korchagin. Out-of-scene av data detection. In Proceedings IADIS International Conference Applied Computing [861], pages 244–248. [ bib | .pdf | Abstract ]
[864] D. Korchagin. Multimodal data flow controller. Idiap-Com Idiap-Com-01-2009, Idiap, P.O. Box 592, CH-1920 Martigny, Switzerland, November 2009. [ bib | .pdf | Abstract ]
[865] C. McCool and S. Marcel. Mobio database for the icpr 2010 face and speech competition. Idiap-Com Idiap-Com-02-2009, Idiap, November 2009. [ bib | .pdf | Abstract ]
[866] M. Pronobis and M. Magimai-Doss. Analysis of f0 and cepstral features for robust automatic gender recognition. Idiap-RR Idiap-RR-30-2009, Idiap, November 2009. [ bib | .pdf | Abstract ]
[867] D. Korchagin, P. N. Garner, and J. Dines. Automatic temporal alignment of av data with confidence estimation. Idiap-RR Idiap-RR-40-2009, Idiap, CH-1920 Martigny, Switzerland, December 2009. [ bib | .pdf | Abstract ]
[868] J. Luo, B. Caputo, and V. Ferrari. Who's doing what: Joint modeling of names and verbs for simultaneous face and pose annotation. In Advances in Neural Information Processing Systems 22 (NIPS09). NIPS Foundation, MIT Press, December 2009. [ bib | .pdf | Abstract ]
[869] A. Popescu-Belis, P. Poller, J. Kilgour, M. Flynn, Sebastian Germesin, A. Nanchen, and M. Yazdani. User interface design in a just-in-time retrieval system for meetings. Idiap-RR Idiap-RR-38-2009, Idiap, December 2009. [ bib | .pdf | Abstract ]
[870] Serena Soldo, M. Magimai-Doss, J. P. Pinto, and H. Bourlard. On mlp-based posterior features for template-based asr. Idiap-RR Idiap-RR-37-2009, Idiap, December 2009. [ bib | .pdf | Abstract ]
[871] D. Korchagin. Memoirs of togetherness from audio logs. In Proceedings International ICST Conference on User Centric Media, P.O. Box 592, CH-1920 Martigny, Switzerland, December 2009. [ bib | .pdf | Abstract ]
[872] J. P. Pinto, M. Magimai-Doss, and H. Bourlard. Mlp based hierarchical system for task adaptation in asr. In Proceedings of the IEEE workshop on Automatic Speech Recognition and Understanding, pages 365–370, December 2009. [ bib | .pdf | Abstract ]
[873] D. Korchagin, P. N. Garner, and J. Dines. Automatic temporal alignment of av data. Idiap-RR Idiap-RR-39-2009, Idiap, December 2009. [ bib | .pdf | Abstract ]
[874] Lakshmi Saheer, John Dines, Philip N. Garner, and Hui Liang. Implementation of vtln for statistical speech synthesis. Idiap-RR Idiap-RR-32-2010, Idiap, 2010. [ bib | .pdf | Abstract ]
[875] Jagannadan Varadarajan, Remi Emonet, and Jean-Marc Odobez. Probabilistic latent sequential motifs: Discovering temporal activity patterns in video scenes. Idiap-RR Idiap-RR-33-2010, Idiap, 2010. [ bib | .pdf | Abstract ]
[876] John Dines, Junichi Yamagishi, and Simon King. Measuring the gap between hmm-based asr and tts. Idiap-RR Idiap-RR-34-2010, Idiap, 2010. [ bib | .pdf ]
[877] David Imseng and Gerald Friedland. Tuning-robust initialization methods for speaker diarization. Idiap-RR Idiap-RR-35-2010, Idiap, Centre du Parc, Rue Marconi 19, Case Postale 592, CH-1920 Martigny, 2010. [ bib | .pdf | Abstract ]
[878] Jagannadan Varadarajan, Remi Emonet, and Jean-Marc Odobez. A sparsity constraint for topic models - application to temporal activity mining. Idiap-RR Idiap-RR-36-2010, Idiap, 2010. [ bib | .pdf | Abstract ]
[879] Joel Praveen Pinto, Mathew Magimai-Doss, and Hervé Bourlard. Hierarchical tandem features for asr in mandarin. Idiap-RR Idiap-RR-39-2010, Idiap, 2010. [ bib | .pdf | Abstract ]
[880] Joan-Isaac Biel and Daniel Gatica-Perez. Vlogcast yourself: Nonverbal behavior and attention in social media. In Proceedings International Conference on Multimodal Interfaces (ICMI-MLMI), 2010. [ bib ]
[881] Gelareh Mohammadi, Alessandro Vinciarelli, and Marcello Mortillaro. The voice of personality: Mapping nonverbal vocal behavior into trait attributions. In Proceedings of ACM Multimedia Workshop on Social Signal Processing, 2010. [ bib ]
[882] V. Murino, M. Cristani, and Alessandro Vinciarelli. Socially intelligent surveillance and monitoring: Analysing social dimensions of physical space. In Proceedings of International Workshop on Socially Intelligent Surveillance and Monitoring, pages 51–58, San Francisco, 2010. [ bib ]
[883] Alessandro Vinciarelli and Fabio Valente. Social signal processing: Understanding nonverbal communication in social interactions. In Proceedings of Measuring Behavior 2010, Eindhoven (The Netherlands), 2010. [ bib ]
[884] Dinesh Babu Jayagopi, Taemie Kim, Alex Pentland, and Daniel Gatica-Perez. Recognizing conversational context in group interaction using privacy-sensitive mobile sensors. In Proceedings of International Conference on Mobile and Ubiquitous Multimedia, Limassol, Cyprus, 2010. [ bib ]
[885] Alessandro Vinciarelli, Roderick Murray-Smith, and Hervé Bourlard. Mobile social signal processing: vision and research issues. In Proceedings of the International Workshop on Mobile HCI, pages 513–516, Lisbon, 2010. [ bib ]
[886] Katayoun Farrahi and Daniel Gatica-Perez. Mining human location-routines using a multi-level approach to topic modeling. In 2010 IEEE Second International Conference on Social Computing, SIN Symposium, Minneapolis, Minnesota, USA, 2010. [ bib ]
[887] Fabio Valente and Alessandro Vinciarelli. Improving speech processing trough social signals: Automatic speaker segmentation of political debates using role based turn-taking patterns. In Proceedings of ACM Multimedia Workshop on Social Signal Processing, 2010. [ bib ]
[888] Dairazalia Sanchez-Cortes, Oya Aran, Marianne Schmid Mast, and Daniel Gatica-Perez. Identifying emergent leadership in small groups using nonverbal communicative cues. In Proc. ICMI-MLMI '10 International Conference on Multimodal Interfaces and the Workshop on Machine Learning for Multimodal Interaction, Beijing, 2010. ACM New York, NY, USA 2010. [ bib ]
[889] Raul Montoliu and Daniel Gatica-Perez. Discovering human places of interest from multimodal mobile phone data. In Proceedings of 9th International Conference on Mobile and Ubiquitous Multimedia, Limassol, Cyprus, 2010. [ bib ]
[890] Trinh-Minh-Tri Do and Daniel Gatica-Perez. By their apps you shall understand them: mining large-scale patterns of mobile phone usage. In The 9th International Conference on Mobile and Ubiquitous Multimedia, 2010. [ bib ]
[891] Andrei Popescu-Belis, Jonathan Kilgour, Peter Poller, Alexandre Nanchen, Erik Boertjes, and Joost de Wit. Automatic content linking: Speech-based just-in-time retrieval for multimedia archives. In Proceedings of the 33rd Annual ACM SIGIR Conference, page 703, 2010. [ bib ]
[892] Stéphanie Lefèvre and Jean-Marc Odobez. View-based appearance model online learning for 3d deformable face tracking. In Proc. Int. Conf. on Computer Vision Theory and Applications, Angers, 2010. [ bib ]
[893] Mikko Kurimo, William Byrne, John Dines, Philip N. Garner, Matthew Gibson, Yong Guan, Teemu Hirsimäki, Reima Karhila, Simon King, Hui Liang, Keiichiro Oura, Lakshmi Saheer, Matt Shannon, Sayaka Shiota, Jilei Tian, Keiichi Tokuda, Mirjam Wester, Yi-Jian Wu, and Junichi Yamagishi. Personalising speech-to-speech translation in the emime project. In Proceedings of the ACL 2010 System Demonstrations, Uppsala, Sweden, 2010. Association for Computational Linguistics. [ bib ]
[894] Jagannadan Varadarajan, Remi Emonet, and Jean-Marc Odobez. A sparsity constraint for topic models - application to temporal activity mining. In NIPS-2010 Workshop on Practical Applications of Sparse Modeling: Open Issues and New Directions, 2010. [ bib ]
[895] Fabio Valente, Mathew Magimai-Doss, Christian Plahl, Ravuri Suman, and Wang Wen. A comparative study of mlp front-ends for mandarin asr. In Proceedings of Interspeech, Japan, 2010. [ bib ]
[896] Alessandro Vinciarelli. Human Behavior Understanding. Springer Verlag, 2010. [ bib ]
[897] Alessandro Vinciarelli and Maja Pantic. www.sspnet.eu: A web portal for social signal processing. IEEE Signal Processing Magazine, 27(4):142–144, 2010. [ bib ]
[898] Fabio Valente. Hierarchical and parallel processing of auditory and modulation frequencies for automatic speech recognition. Speech Communication, 52(10), 2010. [ bib ]
[899] Fabio Valente. Multi-stream speech recognition based on dempster-shafer combination rule. Speech Communication, 52(3), 2010. [ bib ]
[900] J. S. Lee, F. De Simone, and T. Ebrahimi. Video coding based on audio-visual focus of attention. Journal of Visual Communication and Image Representation, 2010. [ bib ]
[901] Apostolos Antonacopoulos, Michael J. Gormish, and Rolf Ingold, editors. Proceedings of the 2010 ACM Symposium on Document Engineering, Manchester, United Kingdom, September 21-24, 2010. ACM, 2010. [ bib ]
[902] Karim Hadjar and Rolf Ingold. Improving xed for extracting content from arabic pdfs. In Document Analysis Systems, pages 371–376, 2010. [ bib ]
[903] Florian Verdet, Driss Matrouf, Jean-François Bonastre, and Jean Hennebert. Channel detectors for system fusion in the context of nist lre 2009. In INTERSPEECH, pages 733–736, 2010. [ bib ]
[904] Dalila Mekhaldi and Denis Lalanne. Multimodal document alignment: Feature-based validation to strengthen thematic links. Journal of Multimedia Processing Technologies, 1(1):30–46, 2010. [ bib ]
[905] Florian Evéquoz, Julien Thomet, and Denis Lalanne. Gérer son information personnelle au moyen de la navigation par facettes. In Conference Internationale Francophone sur l'Interaction Homme-Machine, IHM '10, pages 41–48. ACM, 2010. [ bib ]
[906] Bruno Dumas, Denis Lalanne, and Rolf Ingold. Description languages for multimodal interaction: a set of guidelines and its illustration with smuiml. Journal on Multimodal User Interfaces, 3(3):237–247, 2010. [ bib ]
[907] Pascal Bruegger, Agnes Lisowska, Denis Lalanne, and Beat Hirsbrunner. Enriching the design and prototyping loop: a set of tools to support the creation of activity-based pervasive applications. Journal of Mobile Multimedia, 6(4):339–360, 2010. [ bib ]
[908] Matthias Schwaller, Denis Lalanne, and Omar Abou Khaled. Pygmi: creation and evaluation of a portable gestural interface. In NordiCHI, pages 773–776, 2010. [ bib ]
[909] D. Morrison, E. Bruno, and S. Marchand-Maillet. Tagcaptcha: Annotating images with captchas. In ACM MULTIMEDIA 2010 (Demo Program), 2010. [ bib ]
[910] M. Soleymani and M. Larson. Crowdsourcing for affective annotation of video: development of a viewer-reported boredom corpus. In 33rd ACM SIGIR, Workshop on Crowdsourcing for Search Evaluation, 2010. [ bib ]
[911] B. Deville, G. Bologna, and T. Pun. Detecting objects and obstacles for visually impaired individuals using visual saliency. In ASSETS 2010, 12th Int. ACM SigAccess Conf. on Computers and Accessibility, Demonstrations Track, 2010. [ bib ]
[912] J. D. Gomez, G. Bologna, and T. Pun. Color-audio encoding interface for visual substitution: See color matlab-based demo. In ASSETS 2010, 12th Int. ACM SigAccess Conf. on Computers and Accessibility, Demonstrations Track, 2010. [ bib ]
[913] F. Crestani, S. Marchand-Maillet, H. H. Chen, E. N. Efthimiadis, and J. Savoy. Proceedings of the 33rd International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2010. ACM, New York, USA, 2010. [ bib ]
[914] G. Bologna, B. Deville, and T. Pun. Toward local and global perception modules for vision substitution. Neurocomputing, 74(8):1182–1190, 2010. [ bib ]
[915] S. Pellegrini, A. Ess, M. Tanaskovic, and L. Van Gool. Wrong turn - no dead end: a stochastic pedestrian motion model. In International Workshop on Socially Intelligent Surveillance and Monitoring (SISM), 2010. [ bib ]
[916] S. Pellegrini, A. Ess, and L. Van Gool. Improving data association by joint modeling of pedestrian trajectories and groupings. In European Conference on Computer Vision (ECCV), 2010. [ bib ]
[917] S. Stalder, H. Grabner, and L. Van Gool. Cascaded confidence filtering for improved tracking-by-detection. In European Conference on Computer Vision (ECCV), 2010. [ bib ]
[918] S. Gammeter, T. Quack, D. Tingdahl, and Luc van Gool. Size does matter: improving object recognition and 3d reconstruction with cross-media analysis of image clusters. In European Conference on Computer Vision (ECCV 2010), 2010. [ bib ]
[919] N. Razavi, J. Gall, and Luc Van Gool. Backprojection revisited: Scalable multi-view object detection and similarity metrics for detections. In European Conference on Computer Vision, 2010. [ bib ]
[920] J. Gall, N. Razavi, and Luc Van Gool. On-line adaption of class-specific codebooks for instance tracking. In British Machine Vision Conference, 2010. [ bib ]
[921] J. Knopp, M. Prasad, G. Willems, R. Timofte, and L. Van Gool. Hough transform and 3D SURF for robust three dimensional classification. In Proceedings of the European Conference on Computer Vision, 2010. [ bib ]
[922] J. Knopp, M. Prasad, and L. Van Gool. Orientation invariant 3d object classification using hough transform based methods. In Proceedings of the ACM workshop on 3D object retrieval, 2010. [ bib ]
[923] G. Veres, H. Grabner, L. Middleton, and L. Van Gool. Automatic workflow monitoring in industrial environments. In Proceedings Asian Conference on Computer Vision (ACCV), 2010. [ bib ]
[924] G. Fanelli, A. Yao, P. L. Noel, J. Gall, and L. Van Gool. Hough forest-based facial expression recognition from video sequences. In International Workshop on Sign, Gesture and Activity (SGA) 2010, in conjunction with ECCV 2010, 2010. [ bib ]
[925] F. Nater, J. Vangeneugden, H. Grabner, L. Van Gool, and R. Vogels. Discrimination of locomotion direction at different speeds: A comparison between macaque monkeys and algorithms. In ECML Workshop on rare audio-visual cues, 2010. [ bib ]
[926] C. Lalos, H. Grabner, L. Van Gool, and T. Varvarigou. Object flow: Learning object displacement. In Proceedings IEEE Workshop on Visual Surveillance, 2010. [ bib ]
[927] A. Yao, D. Uebersax, J. Gall, and L. Van Gool. Tracking in broadcast sports. In 32nd Annual Symposium of the German Association for Pattern Recognition, 2010. [ bib ]
[928] A. Mansfield, P. Gehler, L. Van Gool, and C. Rother. Visibility maps for improving seam carving. In Media Retargeting Workshop, European Conference on Computer Vision (ECCV), 2010. [ bib ]
[929] A. Mansfield, P. Gehler, L. Van Gool, and C. Rother. Scene carving: Scene consistent image retargeting. In European Conference on Computer Vision (ECCV), 2010. [ bib ]
[930] M. D. Breitenstein, F. Reichlin, B. Leibe, E. Koller-Meier, and L. Van Gool. Online multi-person tracking-by-detection from a single, uncalibrated camera. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2010. [ bib ]
[931] G. Fanelli, J. Gall, H. Romsdorfer, T. Weise, and L. Van Gool. A 3-d audio-visual corpus of affective communication. IEEE Transactions on Multimedia, 12(6):591–598, 2010. [ bib ]
[932] S. Haegler, P. Wonka, Stefan Mueller Arisona, Luc Van Gool, and P. Müller. Grammar-based encoding of facades. In EGSR, 2010. [ bib ]
[933] M. D. Breitenstein, B. Leibe, and Luc Van Gool. Evaluation of agent motion in video: Online tracking-by-detection. In International Conference on Cognitive Systems, 2010. [ bib ]
[934] J. Gall, N. Razavi, and L. Van Gool. On-line adaption of class-specific codebooks for instance tracking. In British Machine Vision Conference, 2010. [ bib ]
[935] J. Gall, A. Yao, and L. Van Gool. 2d action recognition serves 3d human pose estimation. In European Conference on Computer Vision, 2010. [ bib ]
[936] G. Fanelli, J. Gall, H. Romsdorfer, T. Weise, and L. Van Gool. 3d vision technology for capturing multimodal corpora: Chances and challenges. In LREC Workshop on Multimodal Corpora, 2010. [ bib ]
[937] Fabian Nater, Helmut Grabner, and Luc Van Gool. Visual abnormal event detection for prolonged independent living. In IEEE Healthcom Workshop on mHealth, 2010. [ bib ]
[938] Fabian Nater, Helmut Grabner, and Luc Van Gool. Exploiting simple hierarchies for unsupervised human behavior analysis. In CVPR, 2010. [ bib ]
[939] D. Kuettel, M. D. Breitenstein, Luc Van Gool, and V. Ferrari. What's going on? discovering spatio-temporal dependencies in dynamic scenes. In IEEE Conference on Computer Vision and Pattern Recognition, 2010. [ bib ]
[940] A. Yao, J. Gall, and L. Van Gool. A hough transform-based voting framework for action recognition. In IEEE Conference on Computer Vision and Pattern Recognition, 2010. [ bib ]
[941] M. Sorci, G. Antonini, J. Cruz Mota, T. Rubin, M. Bierlaire, and J. Ph. Thiran. Modelling human perception of static facial expressions. Image and Vision Computing, 28(5):790–806, 2010. [ bib | DOI ]
[942] S. Koelstra, A. Yazdani, M. Soleymani, C. Muehl, J. S. Lee, A. Nijholt, T. Pun, T. Ebrahimi, and I. Patras. Single trial classification of eeg and peripheral physiological signals for recognition of emotions induced by music videos. In Brain Informatics, 2010. [ bib ]
[943] D. Morrison, E. Bruno, and S. Marchand-Maillet. Tagcaptcha: Annotating images with CAPTCHAs. In ACM Multimedia 2010, 2010. [ bib ]
[944] J. Kierkels, M. Soleymani, and T. Pun. Identification of narrative peaks in clips: text features perform best. In VideoCLEF 2009, Cross Language Evaluation Forum (CLEF) Workshop, Post-Conference Proceedings. Springer LNCS, 2010. [ bib ]
[945] I. Kompatsiaris, S. Marchand-Maillet, S. Marcel, and R. van Zwol. Image and Video Retrieval: Theory and Applications. Springer, 2010. [ bib ]
[946] D. Morrison, E. Bruno, and S. Marchand-Maillet. Capturing the semantics of user interaction: A review and case study. In R. Chbeir, Y. Badr, A. Abraham, and A. E. Hassanien, editors, Emergent Web Intelligence: Advanced Information Retrieval. Springer, 2010. [ bib ]
[947] S. Marchand-Maillet, D. Morrison, E. Szekely, and E. Bruno. Interactive representations of multimodal databases. In H. Bourlard, F. Marques, and J. Ph. Thiran, editors, Multimodal Signal Processing for Human Computer Interaction. Academic Press, 2010. [ bib ]
[948] Hsin-Hsi Chen, Efthimis N. Efthimiadis, Jacques Savoy, Fabio Crestani, and S. Marchand-Maillet. Proceedings of the ACM-SIGIR 2010 conference. ACM Digital Library, 2010. [ bib ]
[949] F. Evéquoz, Julien Thomet, and D. Lalanne. La navigation par facettes appliquée à la gestion de l'information personnelle. In Proceedings of 22ème Conférence Francophone sur l'Interaction Homme-Machine (IHM'10), 2010. [ bib ]
[950] Ilya Boyandin, E. Bertini, and D. Lalanne. Using flow maps to explore migrations over time. In Proceedings of Geospatial Visual Analytics Workshop in conjunction with The 13th AGILE International Conference on Geographic Information Science, 2010. [ bib ]
[951] L. Goldmann, F. De Simone, and T. Ebrahimi. A comprehensive database and subjective evaluation methodology for quality of experience in stereoscopic video. In Proceedings of SPIE, volume 7526, San Jose, California, USA, 2010. [ bib ]
[952] L. Goldmann, F. De Simone, and T. Ebrahimi. Impact of acquisition distortion on the quality of stereoscopic images. In Proceedings of International Workshop on Video Processing and Quality Metrics for Consumer Electronics, Scottsdale, Arizona, USA, 2010. [ bib ]
[953] F. De Simone, L. Goldmann, D. Filimonov, and T. Ebrahimi. On the limits of perceptually optimized JPEG. In Proceedings of International Workshop on Video Processing and Quality Metrics for Consumer Electronics, Scottsdale, Arizona, USA, 2010. [ bib ]
[954] F. De Simone, M. Tagliasacchi, M. Naccari, S. Tubaro, and T. Ebrahimi. A H.264/AVC video database for the evaluation of quality metrics. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, pages 2430–2433, Dallas, Texas, USA, 2010. [ bib ]
[955] I. Ivanov, P. Vajda, L. Goldmann, J. S. Lee, and T. Ebrahimi. Object-based tag propagation for semi-automatic annotation of images. In Proceedings of the ACM SIGMM International Conference on Multimedia Information Retrieval, pages 497–506, Philadelphia, USA, 2010. [ bib ]
[956] P. Vajda, I. Ivanov, L. Goldmann, J. S. Lee, and T. Ebrahimi. 3D object duplicate detection for video retrieval. In Proceedings of the International Workshop on Image Analysis for Multimedia Interactive Services, Desenzano del Garda, Italy, 2010. [ bib ]
[957] F. De Simone, L. Goldmann, J. S. Lee, T. Ebrahimi, and V. Baroncini. Subjective evaluation of next-generation video compression algorithm: a case study. In Proceedings of SPIE, volume 7798, San Diego, USA, 2010. [ bib ]
[958] P. Vajda, I. Ivanov, J. S. Lee, L. Goldmann, and T. Ebrahimi. Propagation of geotags based on object duplicate detection. In Proceedings of SPIE, volume 7798, San Diego, USA, 2010. [ bib ]
[959] I. Ivanov, P. Vajda, J. S. Lee, and T. Ebrahimi. Epitome- a social game for photo album summarization. In Proceedings of the International Workshop on Connected Multimedia, Firenze, Italy, 2010. [ bib ]
[960] S. Koelstra, A. Yazdani, M. Soleymani, C. Muehl, J. S. Lee, A. Nijholt, T. Pun, T. Ebrahimi, and I. Patras. Single trial classification of EEG and peripheral physiological signals for recognition of emotions induced by music videos. In Proceedings of the International Conference on Brain Informatics, Toronto, Canada, 2010. [ bib ]
[961] J. S. Lee, F. De Simone, N. Ramzan, Z. Zhao, E. Kurutepe, T. Sikora, J. Ostermann, E. Izquierdo, and T. Ebrahimi. Subjective evaluation of scalable video coding for content distribution. In Proceedings of the ACM Multimedia International Conference, Firenze, Italy, 2010. [ bib ]
[962] S. Buchinger, F. De Simone, E. Hotop, H. Hlavacs, and T. Ebrahimi. Gesture and touch controlled video player interface for mobile devices. In Proceedings of the ACM Multimedia International Conference, Firenze, Italy, 2010. [ bib ]
[963] I. Ivanov, P. Vajda, J. S. Lee, L. Goldmann, and T. Ebrahimi. Geotag propagation in social networks based on user trust model. Multimedia Tools and Applications, 2010. [ bib ]
[964] P. Vajda, I. Ivanov, L. Goldmann, J. S. Lee, and T. Ebrahimi. Robust duplicate detection of 2D and 3D objects. International Journal of Multimedia Data Engineering and Management, 2010. [ bib ]
[965] Pierre Dillenbourg and Patrick Jermann. Technology for Classroom Orchestration. In M. S. Khine and I. M. Saleh, editors, New Science of Learning, pages 525–552. Springer Science Business Media, New York, 2010. [ bib | DOI | Abstract ]
[966] Alexander Sproewitz, Soha Pouya, Stéphane Bonardi, Jesse van den Kieboom, Rico Moeckel, A. Billard, Pierre Dillenbourg, and Auke Ijspeert. Roombots: Reconfigurable Robots for Adaptive Furniture. IEEE Computational Intelligence Magazine, special issue on "Evolutionary and developmental approaches to robotics", 2010. [ bib | DOI | Abstract ]
[967] Khaled Bachour, Frédéric Kaplan, and Pierre Dillenbourg. An Interactive Table for Supporting Participation Balance in Face-to-Face Collaborative Learning. IEEE Transactions on Learning Technologies, 2010. [ bib | DOI | .pdf | Abstract ]
[968] D. Jayagopi and D. Gatica-Perez. Mining group nonverbal conversational patterns using probabilistic topic models. IEEE Transactions on Multimedia, 2010. [ bib | .pdf | Abstract ]
[969] D. Gatica-Perez and J. M. Odobez. Visual attention, speaking activity, and group conversational analysis in multi-sensor environments. In H. Nakashima, J. Augusto, H. Aghajan (Eds.), Handbook of Ambient Intelligence and Smart Environments. Springer, 2010. [ bib ]
[970] L. Saheer, P. N. Garner, J. Dines, and H. Liang. Vtln adaptation for statistical speech synthesis. In Proceedings of ICASSP, 2010. [ bib | .pdf | Abstract ]
[971] D. Vijayasenan, F. Valente, and H. Bourlard. Multistream speaker diarization beyond two acoustic feature streams. In International Conference on Acoustics, Speech, and Signal Processing, 2010. [ bib | .pdf | Abstract ]
[972] J. P. Pinto. Multilayer Perceptron Based Hierarchical Acoustic Modeling for Automatic Speech Recognition. PhD thesis, Ecole polytechnique fédérale de Lausanne, 2010. Thèse Ecole polytechnique fédérale de Lausanne EPFL, no 4649 (2010), Programme doctoral Génie électrique, Faculté des sciences et techniques de l'ingénieur STI, Institut de génie électrique et électronique IEL (Laboratoire de l'IDIAP LIDIAP). Dir.: Hervé Bourlard. [ bib | .pdf | Abstract ]
[973] S. Ba and J. M. Odobez. Multi-person visual focus of attention from head pose and meeting contextual cues. In IEEE Trans. on Pattern Analysis and Machine Intelligence, accepted for publication, November 2009 [334]. IDIAP-RR 08-47. [ bib | .pdf ]
[974] A. Roy and S. Marcel. Introducing crossmodal biometrics: person identification from distinct audio & visual streams. In IEEE Fourth International Conference on Biometrics: Theory, Applications and Systems, number 4, 2010. [ bib | .pdf | Abstract ]
[975] R. Bogdan and D. Gatica-Perez. Inferring competitive role patterns in reality tv show through nonverbal analysis. Multimedia Tools and Applications, Special issue on Social Media, 2010. [ bib | .pdf | Abstract ]
[976] A. Popescu-Belis. Finding without searching. Idiap-Com Idiap-Com-01-2010, Idiap, January 2010. [ bib | .pdf ]
[977] S. H. K. Parthasarathi, M. Magimai-Doss, H. Bourlard, and D. Gatica-Perez. Evaluating the robustness of privacy-sensitive audio features for speech detection in personal audio log scenarios. In ICASSP 2010, 2010. [ bib | .pdf | Abstract ]
[978] H. Hung, Y. Huang, G. Friedland, and D. Gatica-Perez. Estimating dominance in multi-party meetings using speaker diarization. IEEE Transactions on Audio, Speech, and Language Processing, 2010. [ bib | .pdf | Abstract ]
[979] S. Ganapathy, P. Motlicek, and H. Hermansky. Autoregressive models of amplitude modulations in audio compression. IEEE Transactions on Audio, Speech, and Language Processing, 2010. [ bib | http | .pdf | Abstract ]
[980] Afsaneh Asaei, B. Picart, and H. Bourlard. Analysis of phone posterior feature space exploiting class specific sparsity and mlp-based similarity measure. In 2010 IEEE International Conference on Acoustics, Speech and Signal Processing, 2010. [ bib | .pdf ]
[981] J. P. Pinto, G. S. V. S. Sivaram, M. Magimai-Doss, H. Hermansky, and H. Bourlard. Analysis of mlp based hierarchical phoneme posterior probability estimator. IEEE Transactions on Audio, Speech, and Language Processing, 2010. [ bib | .pdf | Abstract ]
[982] Venkatesh Bala Subburaman and S. Marcel. An alternative scanning strategy to detect faces. In Proceedings IEEE International Conference on Acoustics, Speech and Signal Processing, 2010. [ bib | .pdf | Abstract ]
[983] P. Motlicek, P. N. Garner, M. Guillemot, and Vincent Bozzo. Amida/klewel mini-project. Idiap-RR Idiap-RR-03-2010, Idiap, Rue Marconi 19, Martigny, January 2010. [ bib | .pdf | Abstract ]
[984] M. Yazdani and A. Popescu-Belis. A random walk framework to compute textual semantic similarity: a unified model for three benchmark tasks. In Proceedings of the 4th IEEE International Conference on Semantic Computing (ICSC 2010), Carnegie Mellon University, Pittsburgh, PA, USA, 2010. [ bib | .pdf ]
[985] Oya Aran, H. Hung, and D. Gatica-Perez. A multimodal corpus for studying dominance in small group conversations. In LREC workshop on Multimodal Corpora: Advances in Capturing, Coding and Analyzing Multimodality, Malta, May 2010. [ bib | .pdf ]
[986] P. Motlicek, S. Ganapathy, H. Hermansky, and H. Garudadri. Wide-band audio coding based on frequency domain linear prediction. EURASIP Journal on Audio Speech and Music Processing, 2010(856280):14, February 2010. Special Issue: Scalable Audio-Content Analysis. [ bib | DOI | .html | .pdf | Abstract ]
[987] A. Pronobis, J. Luo, and Barbara Caputo. The more you learn, the less you store: Memory-controlled incremental svm for visual place recognition. Image and Vision Computing, February 2010. [ bib | DOI | .pdf | Abstract ]
[988] A. Roy and S. Marcel. Visual processing-inspired fern-audio features for noise-robust speaker verification. In ACM 25th Symposium on Applied Computing, 2010, Sierre, Switzerland. Association for Computing Machinery, March 2010. [ bib | .pdf | Abstract ]
[989] G. Garau and H. Bourlard. Using audio and visual cues for speaker diarisation initialisation. In International Conference on Acoustics, Speech and Signal Processing, March 2010. [ bib | .pdf ]
[990] A. Roy, M. Magimai-Doss, and S. Marcel. Boosted binary features for noise-robust speaker verification. In 2010 IEEE International Conference on Acoustics, Speech and Signal Processing, March 2010. [ bib | .pdf | Abstract ]
[991] D. Korchagin, P. N. Garner, and J. Dines. Automatic temporal alignment of av data with confidence estimation. In Proceedings IEEE International Conference on Acoustics, Speech and Signal Processing [867]. [ bib | .pdf | Abstract ]
[992] Gokul Chittaranjan and H. Hung. Are you a werewolf? detecting deceptive roles and outcomes in a conversational role-playing game. In IEEE International Conference on Acoustics, Speech and Signal Processing, March 2010. [ bib | .pdf | Abstract ]
[993] D. Imseng and G. Friedland. An adaptive initialization method for speaker diarization based on prosodic features. In Proceedings IEEE International Conference on Acoustics, Speech and Signal Processing, pages 4946–4949, March 2010. [ bib | .pdf | Abstract ]
[994] H. Liang, J. Dines, and L. Saheer. A comparison of supervised and unsupervised cross-lingual speaker adaptation approaches for hmm-based speech synthesis. In Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, pages 4598–4601, March 2010. [ bib | .pdf | Abstract ]
[995] J. Luo, F. Orabona, Marco Fornoni, B. Caputo, and Nicolo Cesa-Bianchi. Om-2: An online multi-class multi-kernel learning algorithm. Idiap-RR Idiap-RR-06-2010, Idiap, April 2010. [ bib | .pdf | Abstract ]
[996] A. Roy and S. Marcel. Crossmodal matching of speakers using lip and voice features in temporally non-overlapping audio and video streams. In 20th International Conference on Pattern Recognition, Istanbul, Turkey. International Association for Pattern Recognition (IAPR), April 2010. [ bib | .pdf | Abstract ]
[997] P. Motlicek and F. Valente. Application of out-of-language detection to spoken-term detection. In 2010 IEEE International Conference on Acoustics, Speech and Signal Processing, April 2010. [ bib | .pdf | Abstract ]
[998] P. N. Garner and J. Dines. Tracter: A lightweight dataflow framework. Idiap-RR Idiap-RR-10-2010, Idiap, May 2010. [ bib | .pdf | Abstract ]
[999] Joan-Isaac Biel and Daniel Gatica-Perez. Voices of vlogging. In Proc. AAAI Int. Conf. on Weblogs and Social Media (ICWSM), Washington DC, May 2010. [ bib | .pdf | Abstract ]
[1000] Harry Bunt, Jan Alexandersson, J. Carletta, Jae-Woong Choe, Alex Fang, Koiti Hasida, Kiyong Lee, Volha Petukhova, A. Popescu-Belis, Laurent Romary, Claudia Soria, and David Traum. Towards a standard for dialogue act annotation. In 7th International Conference on Language Resources and Evaluation, May 2010. [ bib | .html | .pdf | Abstract ]
[1001] Trinh-Minh-Tri Do and Thierry Artieres. Neural conditional random fields. In Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, volume 9, pages 177–184. JMLR: W&CP, May 2010. [ bib | .pdf | Abstract ]
[1002] S. Marcel, C. McCool, Pavel Matejka, Timo Ahonen, and Jan Cernocky. Mobile biometry (mobio) face and speaker verification evaluation. Idiap-RR Idiap-RR-09-2010, Idiap, rue Marconi 19, May 2010. [ bib | .pdf | Abstract ]
[1003] Oya Aran and Lale Akarun. A multi-class classification strategy for fisher scores: Application to signer independent sign language recognition. Pattern Recognition, 43(5):1776–1788, May 2010. [ bib | DOI | .pdf | Abstract ]
[1004] F. Orabona, J. Luo, and B. Caputo. Online-batch strongly convex multi kernel learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, June 2010. [ bib | .pdf | Abstract ]
[1005] Afsaneh Asaei, H. Bourlard, and B. Picart. Investigation of knn classifier on posterior features towards application in automatic speech recognition. Idiap-RR Idiap-RR-11-2010, Idiap, June 2010. [ bib | .pdf | Abstract ]
[1006] H. Hung and D. Gatica-Perez. Estimating cohesion in small groups using audio-visual nonverbal behavior. Idiap-RR Idiap-RR-12-2010, Idiap, June 2010. [ bib | .pdf | Abstract ]
[1007] A. Popescu-Belis, J. Kilgour, A. Nanchen, and P. Poller. The acld: Speech-based just-in-time retrieval of multimedia documents and websites. Idiap-RR Idiap-RR-26-2010, Idiap, July 2010. [ bib | .pdf | Abstract ]
[1008] N. Kiukkonen, J. Blom, O. Dousse, Daniel Gatica-Perez, and J. Laurila. Towards rich mobile phone datasets: Lausanne data collection campaign. In Proc. ACM Int. Conf. on Pervasive Services (ICPS), Berlin, July 2010. [ bib | .pdf ]
[1009] L. Saheer, P. N. Garner, and J. Dines. Study of jacobian normalization for vtln. Idiap-RR Idiap-RR-25-2010, Idiap, July 2010. [ bib | .pdf | Abstract ]
[1010] R. A. Negoescu and D. Gatica-Perez. Modeling and understanding flickr communities through topic-based analysis. Idiap-RR Idiap-RR-19-2010, Idiap, July 2010. [ bib | .pdf ]
[1011] R. A. Negoescu, Alexander Loui, and D. Gatica-Perez. Kodak moments and flickr diamonds: How users shape large-scale media. Idiap-RR Idiap-RR-20-2010, Idiap, July 2010. [ bib | .pdf | Abstract ]
[1012] R. A. Negoescu and D. Gatica-Perez. Flickr groups: Multimedia communities for multimedia analysis. Idiap-RR Idiap-RR-18-2010, Idiap, July 2010. [ bib | .pdf ]
[1013] D. Vijayasenan, F. Valente, and H. Bourlard. An information theoretic combination of mfcc and tdoa features for speaker diarization. Idiap-RR Idiap-RR-22-2010, Idiap, July 2010. [ bib | .pdf | Abstract ]
[1014] D. Vijayasenan, F. Valente, and H. Bourlard. Advances in fast multistream diarization based on the information bottleneck framework. Idiap-RR Idiap-RR-23-2010, Idiap, July 2010. [ bib | .pdf | Abstract ]
[1015] K. Farrahi and D. Gatica-Perez. Probabilistic mining of socio-geographic routines from mobile phone data. IEEE Journal of Selected Topics in Signal Processing, 4(4):746–755, August 2010. [ bib | .pdf | Abstract ]
[1016] S. Marcel, C. McCool, Pavel Matejka, Timo Ahonen, Jan Cernocky, et al. On the results of the first mobile biometry (mobio) face and speaker verification evaluation. Idiap-RR Idiap-RR-30-2010, Idiap, August 2010. [ bib | .pdf | Abstract ]
[1017] R. A. Negoescu and D. Gatica-Perez. Modeling and understanding flickr communities through topic-based analysis. IEEE Transactions on Multimedia, 12(5):399–416, August 2010. [ bib | DOI | Abstract ]
[1018] K. Farrahi and D. Gatica-Perez. Mining human location-routines using a multi-level topic model. Idiap-RR Idiap-RR-28-2010, Idiap, August 2010. [ bib | .pdf ]
[1019] S. Marcel, C. McCool, Cosmin Atanasoaei, Flavio Tarsetti, Jan Pesan, Pavel Matejka, Jan Cernocky, Mika Helistekangas, and Markus Turtinen. Mobio: Mobile biometric face and speaker authentication. Idiap-RR Idiap-RR-31-2010, Idiap, rue Marconi 19, August 2010. [ bib | .pdf | Abstract ]
[1020] Oya Aran and D. Gatica-Perez. Fusing audio-visual nonverbal cues to detect dominant people in conversations. In 20th International Conference on Pattern Recognition, Istanbul, Turkey, August 2010. [ bib | .pdf | Abstract ]
[1021] P. N. Garner and J. Dines. Tracter: A lightweight dataflow framework. In Proceedings of Interspeech [998]. [ bib | .pdf | Abstract ]
[1022] D. Imseng, H. Bourlard, and M. Magimai-Doss. Towards mixed language speech recognition systems. In Proceedings of Interspeech, September 2010. [ bib | .pdf | Abstract ]
[1023] T. Hain, Lukas Burget, J. Dines, P. N. Garner, A. El Hannani, M. Huijbregts, M. Karafiat, M. Lincoln, and V. Wan. The amida 2009 meeting transcription system. In Proceedings of Interspeech, September 2010. [ bib | .pdf | Abstract ]
[1024] Mirjam Wester, J. Dines, Matthew Gibson, H. Liang, Yi-Jian Wu, L. Saheer, S. King, Keiichiro Oura, P. N. Garner, William Byrne, Yong Guan, Teemu Hirsimäki, Reima Karhila, Mikko Kurimo, Matt Shannon, Sayaka Shiota, Jilei Tian, Keiichi Tokuda, and J. Yamagishi. Speaker adaptation and the evaluation of speaker similarity in the emime speech-to-speech translation project. In Proceedings of the 7th ISCA Speech Synthesis Workshop, September 2010. [ bib | .pdf | Abstract ]
[1025] Afsaneh Asaei, H. Bourlard, and P. N. Garner. Sparse component analysis for speech recognition in multi-speaker environment. In Proceedings of Interspeech, September 2010. [ bib | .pdf | Abstract ]
[1026] Jagannadan Varadarajan, Remi Emonet, and J. M. Odobez. Probabilistic latent sequential motifs: Discovering temporal activity patterns in video scenes. In BMVC 2010 [875], pages 117.1–117.11. [ bib | .pdf | Abstract ]
[1027] L. Saheer, J. Dines, P. N. Garner, and H. Liang. Implementation of vtln for statistical speech synthesis. In Proceedings of ISCA Speech Synthesis Workshop [874]. [ bib | .pdf | Abstract ]
[1028] D. Imseng, M. Magimai-Doss, and H. Bourlard. Hierarchical multilayer perceptron based language identification. In Proceedings of Interspeech, September 2010. [ bib | .pdf | Abstract ]
[1029] D. Korchagin, P. N. Garner, and P. Motlicek. Hands free audio analysis from home entertainment. In Proceedings of Interspeech, September 2010. [ bib | .pdf | Abstract ]
[1030] Alfred Dielmann, Giulia Garau, and Hervé Bourlard. Floor holder detection and end of speaker turn prediction in meetings. In International Conference on Speech and Language Processing, Interspeech. ISCA, September 2010. [ bib | .pdf | Abstract ]
[1031] P. Motlicek, F. Valente, and P. N. Garner. English spoken term detection in multilingual recordings. In Proceedings of Interspeech, Makuhari, Japan, 2010. ISCA, September 2010. [ bib | .pdf | Abstract ]
[1032] G. Garau, A. Dielmann, and H. Bourlard. Audio-visual synchronisation for speaker diarisation. In International Conference on Speech and Language Processing, Interspeech, September 2010. [ bib | .pdf | Abstract ]
[1033] H. Liang and John Dines. An analysis of language mismatch in hmm state mapping-based cross-lingual speaker adaptation. In Proceedings of Interspeech, September 2010. [ bib | .pdf | Abstract ]
[1034] A. Popescu-Belis, J. Kilgour, A. Nanchen, and P. Poller. The acld: Speech-based just-in-time retrieval of meeting transcripts, documents and websites. In ACM Multimedia Workshop on Searching Spontaneous Conversational Speech [1007]. [ bib | .pdf | Abstract ]
[1035] H. Hung and Gokul Chittaranjan. The wolf corpus: Exploring group behaviour in a competitive role-playing game. In ACM Multimedia, October 2010. [ bib | .pdf | Abstract ]
[1036] Andrei Popescu-Belis, Majid Yazdani, Alexandre Nanchen, and Philip N. Garner. A speech-based just-in-time retrieval system using semantic search. In Proceedings of the ACL-HLT 2011 System Demonstrations (49th Annual Meeting of the Association for Computational Linguistics), pages 80–86, Portland, OR, 2011. [ bib ]
[1037] Steve Whittaker, Simon Tucker, and Denis Lalanne. Meeting browsers and meeting assistants. Cambridge University Press, Cambridge, UK, 2011. [ bib ]
[1038] J. S. Lee, F. De Simone, and T. Ebrahimi. Efficient Video Coding based on Audio-Visual Focus of Attention. Journal of Visual Communication and Image Representation, 22(8):704–711, 2011. [ bib ]
[1039] F. De Simone, L. Goldmann, J. S. Lee, and T. Ebrahimi. Towards High Efficiency Video Coding: Subjective Evaluation of Potential Coding Technologies. Journal of Visual Communication and Image Representation, 22(8):734–748, 2011. [ bib ]
[1040] J. S. Lee, F. De Simone, and T. Ebrahimi. Subjective Quality Evaluation via Paired Comparison: Application to Scalable Video Coding. IEEE Transactions on Multimedia, 13(5):882–893, 2011. [ bib ]
[1041] J. S. Lee, F. De Simone, and T. Ebrahimi. Subjective Quality Evaluation of Foveated Video Coding Using Audio-Visual Focus of Attention. IEEE Journal of Selected Topics in Signal Processing, 5(7):1322–1331, 2011. [ bib ]
[1042] Jean-Louis Durrieu and Jean-Philippe Thiran. Sparse non-negative decomposition of speech power spectra for formant tracking. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing. IEEE Service Center, 445 Hoes Lane, PO Box 1331, Piscataway, NJ 08855-1331 USA, 2011. [ bib | .pdf | Abstract ]
[1043] Fabio Valente. Data-driven extraction of spectral-dynamics based posteriors. In Handbook of Natural Language Processing and Machine Translation. Springer, 2011. [ bib | http ]
[1044] Hugues Salamin and Alessandro Vinciarelli. Introduction to sequence analysis for human behavior understanding. In "Computer Analysis of Human Behavior" by A.Salah and T.Gevers (eds.), pages 21–40. Springer Verlag, 2011. [ bib ]
[1045] Maja Pantic, R. Cowie, F. D'Errico, Dirk Heylen, M. Mehu, C. Pelachaud, I. Poggi, M. Schroeder, and Alessandro Vinciarelli. Social signal processing: The research agenda. In "Visual Analysis of Humans" by T.B.Moeslund, A.Hilton, V.Krueger and L.Sigal (eds.), pages 511–538. Springer Verlag, 2011. [ bib ]
[1046] Radu-Andrei Negoescu and Daniel Gatica-Perez. Flickr groups: Multimedia communities for multimedia analysis. In Xian-Sheng Hua, Marcel Worring, and Tat-Seng Chua, editors, Internet Multimedia Search and Mining. Bentham Science Publishers, 2011. [ bib | Abstract ]
[1047] Cem Keskin, Oya Aran, and Lale Akarun. Hand gesture analysis. In Albert Ali Salah and Theo Gevers, editors, Computer Analysis of Human Behavior, pages 125–149. Springer London, 2011. [ bib ]
[1048] Cong-Thanh Do. Acoustic simulations of cochlear implants in human and machine hearing research. In Cochlear Implantation. InTech Publisher, 2011. [ bib | .pdf ]
[1049] Oya Aran and Daniel Gatica-Perez. Analysis of group conversations: Modeling social verticality. In Albert Ali Salah and Theo Gevers, editors, Computer Analysis of Human Behavior, pages 293–322. Springer London, 2011. [ bib ]
[1050] A. Esposito, Alessandro Vinciarelli, K. Vicsi, C. Pelachaud, and A. Nijholt. Analysis of Verbal and Nonverbal Communication and Enactment: The Processing Issues. Springer Verlag, 2011. [ bib ]
[1051] Joan-Isaac Biel, Oya Aran, and Daniel Gatica-Perez. You are known by how you vlog: Personality impressions and nonverbal behavior in youtube. In Proceedings of AAAI International Conference on Weblogs and Social Media, 2011. [ bib | .pdf | Abstract ]
[1052] Riwal Lefort, L. Fusco, F. Benmansour, Kevin C. Smith, O. Pertz, and Francois Fleuret. Machine learning techniques to analyse complex, computer vision-extracted, dynamic cellular phenotypes. In 1st International SystemsX.ch Conference on Systems Biology, 2011. [ bib ]
[1053] Joel Pinto, Mathew Magimai.-Doss, and Hervé Bourlard. Hierarchical tandem features for asr in mandarin. In Proceedings of Interspeech, 2011. [ bib ]
[1054] Jesus Martinez-Gomez and Barbara Caputo. Towards semi-supervised learning of semantic spatial concepts. In IEEE International Conference on Robotics and Automation, 2011. [ bib | .pdf | Abstract ]
[1055] Albert Ali Salah, Maja Pantic, and Alessandro Vinciarelli. Recent developments in social signal processing. In Proceedings of the IEEE International Conference on Systems, Man and Cybernetics, pages 380–385, 2011. [ bib ]
[1056] Anindya Roy, Mathew Magimai.-Doss, and Sébastien Marcel. Fast speaker verification on mobile phone data using boosted slice classifiers. In IAPR IEEE International Joint Conference on Biometrics, 2011. [ bib | .pdf | Abstract ]
[1057] Fabio Valente and Alessandro Vinciarelli. Language-independent socio-emotional role recognition in the ami meetings corpus. In Proceedings of Interspeech, 2011. [ bib | .pdf ]
[1058] Fabio Valente, Alessandro Vinciarelli, Sree Harsha Yella, and A. Sapru. Understanding social signals in multi-party conversations: Automatic recognition of socio-emotional roles in the ami meeting corpus. In Proceedings of the IEEE International Conference on Systems, Man and Cybernetics, pages 374–379, 2011. [ bib ]
[1059] Fabio Valente, Deepu Vijayasenan, and Petr Motlicek. Speaker diarization of meetings based on speaker role n-gram models. In Proceedings of the International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2011. [ bib | .pdf ]
[1060] Fabian Nater, Tatiana Tommasi, Helmut Grabner, Luc Van Gool, and Barbara Caputo. Transferring activities: Updating human behavior analysis. In Visual Surveillance Workshop at ICCV, 2011. [ bib | .pdf ]
[1061] Roy Wallace, Mitchell McLaren, Chris McCool, and Sébastien Marcel. Inter-session variability modelling and joint factor analysis for face authentication. In International Joint Conference on Biometrics, 2011. [ bib ]
[1062] Antoine Vinel, Trinh-Minh-Tri Do, and Thierry Artieres. Joint optimization of hidden conditional random fields and non linear feature extraction. In Proceedings of International Conference on Document Analysis and Recognition, 2011. [ bib ]
[1063] Deepu Vijayasenan, Fabio Valente, and Petr Motlicek. Multistream speaker diarization through information bottleneck system outputs combination. In Proceedings of International Conference on Acoustics, Speech and Signal Processing, 2011. [ bib | .pdf ]
[1064] Fabio Valente, Mathew Magimai.-Doss, and Wen Wang. Analysis and comparison of recent mlp features for lvcsr systems. In Proceedings of Interspeech 2011, 2011. [ bib | .pdf ]
[1065] Sree Harsha Yella and Fabio Valente. Information bottleneck features for hmm/gmm speaker diarization of meetings recordings. In Interspeech, pages 953–956, 2011. [ bib | .pdf | Abstract ]
[1066] Danil Korchagin. Automatic time skew detection and correction. In Proceedings International Conference on Signal Acquisition and Processing, Martigny, Switzerland, 2011. [ bib | .pdf | Abstract ]
[1067] David Klotz, Johannes Wienke, Julia Peltason, Britta Wrede, Sebastian Wrede, Vasil Khalidov, and Jean-Marc Odobez. Engagement-based multi-party dialog with a humanoid robot. In Proceedings of the SIGDIAL 2011: the 12th Annual Meeting of the Special Interest Group on Discourse and Dialogue, pages 341–343, 2011. [ bib | .pdf | Abstract ]
[1068] Alexandre Heili, Cheng Chen, and Jean-Marc Odobez. Detection-based multi-human tracking using a crf model. In The Eleventh IEEE International Workshop on Visual Surveillance, 2011. [ bib | .pdf ]
[1069] German Gonzalez, L. Fusco, Riwal Lefort, F. Benmansour, Pascal Fua, and Kevin C. Smith. Automated quantification of morphodynamics for high-throughput live cell imaging datasets. In 1st International SystemsX.ch Conference on Systems Biology, 2011. [ bib ]
[1070] L. Fusco, Kevin C. Smith, F. Benmansour, Riwal Lefort, Francois Fleuret, Pascal Fua, and O. Pertz. Morphodynamic profiling to explore spatio-temporal signaling networks regulating neurite outgrowth. In 1st International SystemsX.ch Conference on Systems Biology, 2011. [ bib ]
[1071] Francois Fleuret, Philip Abbet, Charles Dubout, and Leonidas Lefakis. The mash project. In European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2011. [ bib | .pdf ]
[1072] Stefan Duffner, Petr Motlicek, and Danil Korchagin. The ta2 database - a multi-modal database from home entertainment. In International Conference on Signal Acquisition and Processing, 2011. [ bib | .pdf | Abstract ]
[1073] Charles Dubout and Francois Fleuret. Boosting with maximum adaptive sampling. In Proceedings of the Neural Information Processing Systems Conference, 2011. [ bib ]
[1074] M. Cristani, A. Pesarin, Alessandro Vinciarelli, M. Crocco, and V. Murino. Look at who's talking. In Proceedings of International Conference on Ambient Intelligence, pages 68–76, 2011. [ bib ]
[1075] M. Cristani, G. Paggetti, Alessandro Vinciarelli, L. Bazzani, G. Menegaz, and V. Murino. Towards computational proxemics: Inferring social relations from interpersonal distances. In Proceedings of the IEEE International Conference on Social Computing, pages 290–297, 2011. [ bib ]
[1076] Ronan Collobert, Koray Kavukcuoglu, and Clément Farabet. Torch7: A matlab-like environment for machine learning. In BigLearn, NIPS Workshop, 2011. [ bib | .pdf ]
[1077] Ronan Collobert. Deep learning for efficient discriminative parsing. In International Conference on Artificial Intelligence and Statistics, 2011. [ bib | .pdf ]
[1078] Cheng Chen, Alexandre Heili, and Jean-Marc Odobez. A joint estimation of head and body orientation cues in surveillance video. In IEEE International Workshop on Socially Intelligent Surveillance and Monitoring, 2011. [ bib | .pdf ]
[1079] Antoine Bordes, Jason Weston, Ronan Collobert, and Yoshua Bengio. Learning structured embeddings of knowledge bases. In Conference on Artificial Intelligence, 2011. [ bib | .pdf ]
[1080] J. Blom, Daniel Gatica-Perez, and N. Kiukkonen. People-centric mobile sensing with a pragmatic twist: from behavioral data points to active user involvement. In International Conference on Human-Computer Interaction with Mobile Devices and Services, 2011. [ bib | .pdf | Abstract ]
[1081] Xavier Alameda-Pineda, Vasil Khalidov, Radu Horaud, and Florence Forbes. Finding audio-visual events in informal social gatherings. In IEEE/ACM 13th International Conference on Multimodal Interaction, 2011. Outstanding paper award. [ bib | .pdf | Abstract ]
[1082] Deepu Vijayasenan, Fabio Valente, and Hervé Bourlard. An information theoretic combination of mfcc and tdoa features for speaker diarization. IEEE Transactions on Audio Speech and Language Processing, 19(2), 2011. [ bib | DOI | Abstract ]
[1083] Fabio Valente, Mathew Magimai.-Doss, Christian Plahl, Suman Ravuri, and Wen Wang. Transcribing mandarin broadcast speech using multi-layer perceptron acoustic features. IEEE Transactions on Audio, Speech, and Language Processing, 19(8), 2011. [ bib | DOI ]
[1084] Tatiana Tommasi, Francesco Orabona, Claudio Castellini, and Barbara Caputo. Improving control of dexterous hand prostheses using adaptive learning. IEEE Transactions on Robotics, 2011. [ bib | Abstract ]
[1085] Dairazalia Sanchez-Cortes, Oya Aran, Marianne Schmid Mast, and Daniel Gatica-Perez. A nonverbal behavior approach to identify emergent leaders in small groups. IEEE Transactions on Multimedia, 2011. [ bib | DOI | .pdf ]
[1086] Anindya Roy, Mathew Magimai.-Doss, and Sébastien Marcel. A fast parts-based approach to speaker verification using boosted slice classifiers. IEEE Transactions on Information Forensics and Security, 2011. [ bib | .pdf | Abstract ]
[1087] Andrei Popescu-Belis, Denis Lalanne, and Hervé Bourlard. Finding information in multimedia records of meetings. IEEE Multimedia, 2011. In press. [ bib | DOI | http | Abstract ]
[1088] Sree Hari Krishnan Parthasarathi, Padmanabhan Rajan, and Hema A Murthy. Robustness of group delay representations for noisy speech signals. IJST (Springer), 14(4), 2011. [ bib | .pdf | Abstract ]
[1089] Jesus Martinez-Gomez and Barbara Caputo. Towards semi-supervised learning of semantic spatial concepts for mobile robots. Journal of Physical Agents, 2011. [ bib | .pdf ]
[1090] Anmol Madan, Manuel Cebrian, Sai Moturu, Katayoun Farrahi, and Alex Pentland. Sensing the "health state" of our society. IEEE Pervasive Computing, Special Issue on Large-Scale Opportunistic Sensing, 2011. [ bib | .pdf ]
[1091] Francois Fleuret, Ting Li, Charles Dubout, Emma K. Wampler, Steven Yantis, and Donald Geman. Comparing machines and humans on a visual categorization test. Proceedings of the National Academy of Sciences, 2011. [ bib ]
[1092] Katayoun Farrahi and Daniel Gatica-Perez. Discovering routines from large-scale human locations using probabilistic topic models. ACM Transactions on Intelligent Systems and Technology, 2(1), 2011. [ bib | .pdf | Abstract ]
[1093] Trinh-Minh-Tri Do and Thierry Artieres. Non-convex regularized bundle method. Journal of Machine Learning Research, 2011. [ bib | Abstract ]
[1094] John Dines, Hui Liang, Lakshmi Saheer, Matthew Gibson, William Byrne, Keiichiro Oura, Keiichi Tokuda, Junichi Yamagishi, Simon King, Mirjam Wester, Teemu Hirsimäki, Reima Karhila, and Mikko Kurimo. Personalising speech-to-speech translation: Unsupervised cross-lingual speaker adaptation for hmm-based speech synthesis. Computer Speech and Language, 2011. [ bib | DOI | http | .pdf | Abstract ]
[1095] Ronan Collobert, Jason Weston, Léon Bottou, Michael Karlen, Koray Kavukcuoglu, and Pavel Kuksa. Natural language processing (almost) from scratch. Journal of Machine Learning Research, 12:2493–2537, 2011. [ bib | .pdf ]
[1096] Claudio Castellini, Tatiana Tommasi, Nicoletta Noceti, Francesca Odone, and Barbara Caputo. Using object affordances to improve object recognition. IEEE Transactions on Autonomous Mental Development, 2011. [ bib | .pdf ]
[1097] Joan-Isaac Biel and Daniel Gatica-Perez. Vlogsense: Conversational behavior and social attention in youtube. Transactions on Multimedia Computing, Communications and Applications, 2011. [ bib | .pdf | Abstract ]
[1098] Roy Wallace, Mitchell McLaren, Chris McCool, and Sébastien Marcel. Inter-session variability modelling and joint factor analysis for face authentication. Idiap-RR Idiap-RR-28-2011, Idiap, 2011. [ bib | .pdf | Abstract ]
[1099] Mohammad J. Taghizadeh, Philip N. Garner, and Hervé Bourlard. Broadband beampattern for multi-channel speech acquisition and distant speech recognition. Idiap-RR Idiap-RR-39-2011, Idiap, 2011. [ bib | .pdf | Abstract ]
[1100] Nicolae Suditu and Francois Fleuret. Heat: Iterative relevance feedback with one million images. Idiap-RR Idiap-RR-33-2011, Idiap, 2011. [ bib | .pdf | Abstract ]
[1101] Anindya Roy, Mathew Magimai.-Doss, and Sébastien Marcel. Continuous speech recognition using boosted binary features. Idiap-RR Idiap-RR-35-2011, Idiap, 2011. [ bib | .pdf | Abstract ]
[1102] Ramya Rasipuram and Mathew Magimai.-Doss. Acoustic data-driven grapheme-to-phoneme conversion using kl-hmm. Idiap-RR Idiap-RR-38-2011, Idiap, 2011. [ bib | .pdf | Abstract ]
[1103] Sree Hari Krishnan Parthasarathi, Padmanabhan Rajan, and Hema A Murthy. Robustness of group delay representations for noisy speech signals. Idiap-RR Idiap-RR-36-2011, Idiap, 2011. [ bib | .pdf ]
[1104] Mert Ozcan, Jie Luo, Vittorio Ferrari, and Barbara Caputo. A large-scale database of images and captions for automatic face naming. Idiap-RR Idiap-RR-26-2011, Idiap, 2011. [ bib | .pdf | Abstract ]
[1105] Francesco Orabona and Jie Luo. Ultra-fast optimization algorithm for sparse multi kernel learning. Idiap-RR Idiap-RR-11-2011, Idiap, 2011. [ bib | .pdf | Abstract ]
[1106] Chris McCool and Sébastien Marcel. Parts-based face verification using local frequency bands. Idiap-RR Idiap-RR-06-2011, Idiap, 2011. [ bib | .pdf | Abstract ]
[1107] Jesus Martinez-Gomez and Barbara Caputo. Towards semi-supervised learning of semantic spatial concepts. Idiap-RR Idiap-RR-03-2011, Idiap, 2011. [ bib | .pdf | Abstract ]
[1108] Jie Luo, Francesco Orabona, Barbara Caputo, and Vittorio Ferrari. Learning from images with captions using the maximum margin set algorithm. Idiap-RR Idiap-RR-30-2011, Idiap, 2011. [ bib | .pdf | Abstract ]
[1109] Jie Luo and Francesco Orabona. Learning from candidate labeling sets. Idiap-RR Idiap-RR-27-2011, Idiap, 2011. [ bib | .pdf | Abstract ]
[1110] Jie Luo, Tatiana Tommasi, and Barbara Caputo. Multiclass transfer learning from unconstrained priors. Idiap-RR Idiap-RR-25-2011, Idiap, 2011. [ bib | .pdf | Abstract ]
[1111] Danil Korchagin, Stefan Duffner, Petr Motlicek, and Carl Scheffler. Multimodal cue detection engine for orchestrated entertainment. Idiap-RR Idiap-RR-34-2011, Idiap, Martigny, Switzerland, 2011. [ bib | .pdf | Abstract ]
[1112] Danil Korchagin and Hamid Reza Abutalebi. Social focus of attention as a time function derived from multimodal signals. Idiap-RR Idiap-RR-09-2011, Idiap, Martigny, Switzerland, 2011. [ bib | .pdf | Abstract ]
[1113] Danil Korchagin. Audio spatio-temporal fingerprints for cloudless real-time hands-free diarization on mobile devices. Idiap-RR Idiap-RR-08-2011, Idiap, Martigny, Switzerland, 2011. [ bib | .pdf | Abstract ]
[1114] Niklas Johansson, Chris McCool, and Sébastien Marcel. On-line unsupervised adaptation for face verification using gaussian mixture models with multiple user models. Idiap-RR Idiap-RR-07-2011, Idiap, 2011. [ bib | .pdf | Abstract ]
[1115] Laurent El Shafey, Roy Wallace, and Sébastien Marcel. Face verification using gabor filtering and adapted gaussian mixture models. Idiap-RR Idiap-RR-37-2011, Idiap, 2011. [ bib | .pdf | Abstract ]
[1116] Cong-Thanh Do, Mohammad J. Taghizadeh, and Philip N. Garner. Improving microphone array speech recognition with cochlear implant-like spectrally reduced speech. Idiap-RR Idiap-RR-40-2011, Idiap, 2011. [ bib | .pdf | Abstract ]
[1117] Murali Mohan Chakka, André Anjos, and Sébastien Marcel. Competition on counter measures to 2-d facial spoofing attacks. Idiap-RR Idiap-RR-29-2011, Idiap, 2011. International Joint Conference on Biometrics (IJCB) 2011. [ bib | .pdf | Abstract ]
[1118] Hamid Reza Abutalebi, Mehdi Rashidinejad, Hervé Bourlard, and Ali Akbar Tadaion. Speech enhancement using beta-order mmse spectral amplitude estimator with laplacian prior. Idiap-RR Idiap-RR-24-2011, Idiap, 2011. [ bib | .pdf | Abstract ]
[1119] Andrei Popescu-Belis, Denis Lalanne, and Hervé Bourlard. When users meet technology: The meeting browser development helix. Idiap-RR Idiap-RR-05-2011, Idiap, 2011. [ bib | .pdf ]
[1120] Andrei Popescu-Belis, Majid Yazdani, Alexandre Nanchen, and Philip N. Garner. A speech-based just-in-time retrieval system using semantic search. Idiap-RR Idiap-RR-31-2011, Idiap, Portland, OR, 2011. [ bib | .pdf ]
[1121] Andrei Popescu-Belis, Denis Lalanne, and Hervé Bourlard. Finding information in multimedia records of meetings. Idiap-RR Idiap-RR-32-2011, Idiap, 2011. [ bib | .pdf | Abstract ]
[1122] Stefan Duffner and Jean-Marc Odobez. Exploiting long-term observations for track creation and deletion in online multi-face tracking. Idiap-RR Idiap-RR-01-2011, Idiap, Rue Marconi 19, CH-1920 Martigny, 2011. [ bib | .pdf | Abstract ]
[1123] Ramya Rasipuram and Mathew Magimai.-Doss. Integrating articulatory features using kullback-leibler divergence based acoustic model for phoneme recognition. Idiap-RR Idiap-RR-02-2011, Idiap, 2011. [ bib | .pdf | Abstract ]
[1124] Afsaneh Asaei, Hervé Bourlard, and Volkan Cevher. Model-based compressive sensing for multi-party distant speech recognition. Idiap-RR Idiap-RR-04-2011, Idiap, 2011. [ bib | Abstract ]
[1125] Danil Korchagin, Petr Motlicek, Stefan Duffner, and Hervé Bourlard. Just-in-time multimodal association and fusion from home entertainment. Idiap-RR Idiap-RR-10-2011, Idiap, Martigny, Switzerland, 2011. [ bib | .pdf | Abstract ]
[1126] Sree Hari Krishnan Parthasarathi, Daniel Gatica-Perez, Hervé Bourlard, and Mathew Magimai.-Doss. Privacy-sensitive audio features for speech/nonspeech detection. Idiap-RR Idiap-RR-12-2011, Idiap, 2011. [ bib | .pdf | Abstract ]
[1127] David Imseng, Hervé Bourlard, Mathew Magimai.-Doss, and John Dines. Language dependent universal phoneme posterior estimation for mixed language speech recognition. Idiap-RR Idiap-RR-13-2011, Idiap, 2011. [ bib | .pdf | Abstract ]
[1128] Sree Hari Krishnan Parthasarathi, Hervé Bourlard, and Daniel Gatica-Perez. Lp residual features for robust, privacy-sensitive speaker diarization. Idiap-RR Idiap-RR-14-2011, Idiap, 2011. [ bib | .pdf | Abstract ]
[1129] Philip N. Garner. Cepstral normalisation and the signal to noise ratio spectrum in automatic speech recognition. Idiap-RR Idiap-RR-15-2011, Idiap, 2011. [ bib | .pdf | Abstract ]
[1130] Mohammad J. Taghizadeh, Philip N. Garner, Hervé Bourlard, Hamid Reza Abutalebi, and Afsaneh Asaei. An integrated framework for multi-channel multi-source localization and voice activity detection. Idiap-RR Idiap-RR-16-2011, Idiap, 2011. [ bib | .pdf | Abstract ]
[1131] Hui Liang and John Dines. Phonological knowledge guided hmm state mapping for cross-lingual speaker adaptation. Idiap-RR Idiap-RR-17-2011, Idiap, 2011. [ bib | .pdf | Abstract ]
[1132] Mirjam Wester and Hui Liang. Cross-lingual speaker discrimination using natural and synthetic speech. Idiap-RR Idiap-RR-18-2011, Idiap, 2011. [ bib | .pdf | Abstract ]
[1133] David Imseng, Hervé Bourlard, John Dines, Philip N. Garner, and Mathew Magimai.-Doss. Improving non-native asr through stochastic multilingual phoneme space transformations. Idiap-RR Idiap-RR-19-2011, Idiap, 2011. [ bib | .pdf | Abstract ]
[1134] Danil Korchagin. Impact of excitation frequency on short-term recording synchronisation and confidence estimation. Idiap-RR Idiap-RR-20-2011, Idiap, Martigny, Switzerland, 2011. [ bib | .pdf | Abstract ]
[1135] Ramya Rasipuram and Mathew Magimai.-Doss. Multitask learning to improve articulatory feature estimation and phoneme recognition. Idiap-RR Idiap-RR-21-2011, Idiap, 2011. [ bib | .pdf | Abstract ]
[1136] Afsaneh Asaei, Mohammad J. Taghizadeh, Hervé Bourlard, and Volkan Cevher. Multi-party speech recovery exploiting structured sparsity models. Idiap-RR Idiap-RR-22-2011, Idiap, 2011. [ bib | Abstract ]
[1137] Georgios Skoumas and Philip N. Garner. Intuitive recipes for uncertainty decoding with snr features for noise robust asr. Idiap-RR Idiap-RR-23-2011, Idiap, 2011. [ bib | .pdf ]
[1138] Joan-Isaac Biel, Oya Aran, and Daniel Gatica-Perez. You are known by how you vlog: Personality impressions and nonverbal behavior in youtube. In Proceedings of AAAI International Conference on Weblogs and Social Media, Barcelona, 2011. [ bib ]
[1139] Gokul Chittaranjan, J. Blom, and Daniel Gatica-Perez. Who's who with big-five: Analyzing and classifying personality traits with smartphones. In International Symposium on Wearable Computing, page 8, 2011. [ bib ]
[1140] Danil Korchagin and Hamid Reza Abutalebi. Social focus of attention as a time function derived from multimodal signals. In Proceedings IEEE International Conference on Multimedia & Expo, Barcelona, Spain, 2011. [ bib ]
[1141] Anmol Madan, Katayoun Farrahi, Daniel Gatica-Perez, and Alex Pentland. Pervasive sensing to model political opinions in face-to-face networks. In Pervasive, San Francisco, 2011. [ bib ]
[1142] Gokul Chittaranjan, Oya Aran, and Daniel Gatica-Perez. Inferring truth from multiple annotators for social interaction analysis. In Neural Information Processing Systems (NIPS) Workshop on Modeling Human Communication Dynamics (HCD), page 4, 2011. [ bib ]
[1143] Gelareh Mohammadi and Alessandro Vinciarelli. Humans as feature extractors: Combining prosody and personality perception for better speaking style recognition. In Proceedings of IEEE Int. Conference on Systems, Man, and Cybernetics - Special Sessions, 2011. [ bib ]
[1144] Trinh-Minh-Tri Do and Daniel Gatica-Perez. Groupus: Smartphone proximity data and human interaction type mining. In 15th annual International Symposium on Wearable Computers, San Francisco, USA, 2011. [ bib ]
[1145] Gokul Chittaranjan, Oya Aran, and Daniel Gatica-Perez. Exploiting observers' judgements for nonverbal group interaction analysis. In IEEE Conference on Automatic Face and Gesture Recognition, page 6. IEEE, 2011. [ bib ]
[1146] Trinh-Minh-Tri Do and Daniel Gatica-Perez. Contextual grouping: discovering real-life interaction types from longitudinal bluetooth data. In 12th International Conference on Mobile Data Management, 2011. [ bib ]
[1147] Gelareh Mohammadi and Alessandro Vinciarelli. Automatic attribution of personality traits based on prosodic features. In Proceedings of ACM Multimedia 2011 workshop, 2011. [ bib ]
[1148] Andrei Popescu-Belis, Majid Yazdani, Alexandre Nanchen, and Philip N. Garner. A just-in-time document retrieval system for dialogues or monologues. In SIGDIAL 2011 (12th annual SIGDIAL Meeting on Discourse and Dialogue), Demonstration Session, pages 350–352, Portland, OR, 2011. [ bib ]
[1149] Mehdi Banitalebi Dehkordi, Hamid Reza Abutalebi, and Hossein Ghanei. A compressive sensing based compressed neural network for sound source localization. In Proceedings of International Symposium on Artificial Intelligence and Signal Processing, 2011. [ bib ]
[1150] Mert Ozcan, Jie Luo, Vittorio Ferrari, and Barbara Caputo. A large-scale database of images and captions for automatic face naming. In Proceedings of the 22nd British Machine Vision Conference, 2011. [ bib ]
[1151] Mohammad J. Taghizadeh, Philip N. Garner, Hervé Bourlard, Hamid Reza Abutalebi, and Afsaneh Asaei. An integrated framework for multi-channel multi-source localization and voice activity detection. In The Third Joint Workshop on Hands-free Speech Communication and Microphone Arrays, 2011. [ bib ]
[1152] Cheng Chen, Alexandre Heili, and Jean-Marc Odobez. Combined estimation of location and body pose in surveillance video. In AVSS, 2011. [ bib ]
[1153] Stefan Duffner and Jean-Marc Odobez. Exploiting long-term observations for track creation and deletion in online multi-face tracking. In IEEE Conference on Automatic Face and Gesture Recognition, Santa Barbara, USA, 2011. [ bib ]
[1154] Mirjam Wester and Hui Liang. Cross-lingual speaker discrimination using natural and synthetic speech. In Proceedings of Interspeech, Florence, Italy, 2011. [ bib ]
[1155] Mathew Magimai.-Doss, Ramya Rasipuram, Guillermo Aradilla, and Hervé Bourlard. Grapheme-based automatic speech recognition using kl-hmm. In Proceedings of Interspeech, 2011. [ bib ]
[1156] Karim Ali, David Hasler, and Francois Fleuret. Flowboost - appearance learning from sparsely annotated video. In Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition, 2011. [ bib ]
[1157] Remi Emonet, Jagannadan Varadarajan, and Jean-Marc Odobez. Extracting and locating temporal motifs in video scenes using a hierarchical non parametric bayesian model. In IEEE Conference on Computer Vision and Pattern Recognition, 2011. [ bib ]
[1158] Nicolae Suditu and Francois Fleuret. Heat: Iterative relevance feedback with one million images. In International Conference on Computer Vision, 2011. [ bib ]
[1159] Majid Yazdani and Andrei Popescu-Belis. Using a wikipedia-based semantic relatedness measure for document clustering. In Graph-based Methods for Natural Language Processing, 2011. [ bib ]
[1160] Horesh Ben Shitrit, Jerome Berclaz, Francois Fleuret, and Pascal Fua. Tracking multiple objects under global appearance constraints. In Proceedings of the IEEE International Conference on Computer Vision, 2011. [ bib ]
[1161] Serena Soldo, Mathew Magimai.-Doss, Joel Praveen Pinto, and Hervé Bourlard. Posterior features for template-based asr. In Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, Prague, Czech Republic, 2011. [ bib ]
[1162] Hui Liang and John Dines. Phonological knowledge guided hmm state mapping for cross-lingual speaker adaptation. In Proceedings of Interspeech, Florence, Italy, 2011. [ bib ]
[1163] Anindya Roy, Mathew Magimai.-Doss, and Sébastien Marcel. Phoneme recognition using boosted binary features. In IEEE Intl. Conference on Acoustics, Speech and Signal Processing 2011, 2011. [ bib ]
[1164] Afsaneh Asaei, Mohammad J. Taghizadeh, Hervé Bourlard, and Volkan Cevher. Multi-party speech recovery exploiting structured sparsity models. In Proceedings of International Speech Communication Association, INTERSPEECH, 2011. [ bib ]
[1165] Afsaneh Asaei, Hervé Bourlard, and Volkan Cevher. Model-based compressive sensing for multi-party distant speech recognition. In 2011 IEEE International Conference on Acoustics, Speech and Signal Processing, 2011. [ bib ]
[1166] Sree Hari Krishnan Parthasarathi, Hervé Bourlard, and Daniel Gatica-Perez. Lp residual features for robust, privacy-sensitive speaker diarization. In Interspeech, 2011. [ bib ]
[1167] David Imseng, Hervé Bourlard, Mathew Magimai.-Doss, and John Dines. Language dependent universal phoneme posterior estimation for mixed language speech recognition. In Proceedings IEEE International Conference on Acoustics, Speech and Signal Processing, pages 5012–5015, Prague, CZ, 2011. [ bib ]
[1168] Carl Scheffler and Jean-Marc Odobez. Joint adaptive colour modelling and skin, hair and clothing segmentation using coherent probabilistic index maps. In British Machine Vision Conference, Dundee, UK, 2011. [ bib ]
[1169] Ramya Rasipuram and Mathew Magimai.-Doss. Integrating articulatory features using kullback-leibler divergence based acoustic model for phoneme recognition. In Proceedings IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP, pages 5192–5195, 2011. [ bib ]
[1170] David Imseng, Hervé Bourlard, John Dines, Philip N. Garner, and Mathew Magimai.-Doss. Improving non-native asr through stochastic multilingual phoneme space transformations. In Proceedings of Interspeech, Florence, Italy, 2011. [ bib ]
[1171] Ramya Rasipuram and Mathew Magimai.-Doss. Improving articulatory feature and phoneme recognition using multitask learning. In Artificial Neural Networks and Machine Learning - ICANN 2011, pages 299–306. Springer Berlin / Heidelberg, 2011. [ bib ]
[1172] Joan-Isaac Biel and Daniel Gatica-Perez. Call me guru: user categories and large-scale behavior in youtube. In Social Media Computing. Springer, 2011. [ bib ]
[1173] Dinesh Babu Jayagopi, Taemie Kim, Alex Pentland, and Daniel Gatica-Perez. Privacy-sensitive recognition of group conversational context with sociometers. Springer Multimedia Systems Journal, 2011. [ bib ]
[1174] Dairazalia Sanchez-Cortes, Oya Aran, Marianne Schmid Mast, and Daniel Gatica-Perez. Detecting emergent leaders in small groups using nonverbal behavior. IEEE Transactions on Multimedia, 2011. [ bib ]
[1175] Philip N. Garner. Cepstral normalisation and the signal to noise ratio spectrum in automatic speech recognition. Speech Communication, 53(8):991–1001, 2011. [ bib ]
[1176] Andrei Popescu-Belis and Sandrine Zufferey. Automatic identification of discourse markers in multiparty dialogues: An in-depth study of like and well. Computer Speech and Language, 25(3):499–518, 2011. [ bib ]
[1177] Jagannadan Varadarajan, Remi Emonet, and Jean-Marc Odobez. A sequential topic model for mining recurrent activities from video and audio data logs. IEEE Transactions on Circuits and Systems for Video Technology, 2011. [ bib ]
[1178] Karim Ali, Francois Fleuret, David Hasler, and Pascal Fua. A real-time deformable detector. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2011. [ bib ]
[1179] Cong-Thanh Do, Dominique Pastor, and André Goalic. A novel framework for noise robust asr using cochlear implant-like spectrally reduced speech. Speech Communication, 2011. [ bib ]
[1180] Cheng Chen, Yi Yang, Feiping Nie, and Jean-Marc Odobez. 3d human pose recovery from image by efficient visual feature selection. Computer Vision and Image Understanding, 115(3), 2011. [ bib ]
[1181] Jerome Berclaz, Engin Turetken, Francois Fleuret, and Pascal Fua. Multiple object tracking using k-shortest paths optimization. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2011. [ bib ]
[1182] Cheng Chen. Learning a 3d human pose distance metric from geometric pose descriptor. IEEE Transactions on Visualization and Computer Graphics, 2011. [ bib ]
[1183] Lakshmi Saheer, John Dines, and Philip N. Garner. Vocal tract length normalization for statistical parametric speech synthesis. IEEE Transactions on Audio, Speech, and Language Processing, 2011. [ bib ]
[1184] P. Dillenbourg, G. Zufferey, H. S. Alavi, P. Jermann, S. Do Lenh, Q. Bonnard, S. Cuendet, and F. Kaplan. Classroom orchestration: The third circle of usability. In Proceedings of the 9th Computer-Supported Collaborative Learning Conference, Hong Kong, 2011. [ bib ]
[1185] P. Vajda, I. Ivanov, L. Goldmann, and T. Ebrahimi. Let Epitome summarize your photo collection! In Proc. International Conference on Multimedia and Expo (ICME'11), Barcelona, Spain, 2011. [ bib ]
[1186] P. Vajda, I. Ivanov, L. Goldmann, and T. Ebrahimi. Social game Epitome versus automatic visual analysis. In Proc. International Conference on Multimedia and Expo (ICME'11), Barcelona, Spain, 2011. [ bib ]
[1187] P. Vajda, I. Ivanov, L. Goldmann, and T. Ebrahimi. Omnidirectional object duplicate detection. In Proc. International Workshop on Digital Signal Processing (DSPE'11), pages 332–337, Sedona, Arizona, USA, 2011. [ bib ]
[1188] J. S. Lee, F. De Simone, and T. Ebrahimi. Subjective quality assessment of scalable video coding. In Proc. International Workshop on Quality of Multimedia Experience (QoMEX'11), Mechelen, Belgium, 2011. [ bib ]
[1189] J. S. Lee, F. De Simone, and T. Ebrahimi. Subjective quality evaluation of foveated video coding using audio-visual focus of attention. IEEE Journal of Selected Topics in Signal Processing, 2011. [ bib ]
[1190] J. S. Lee, F. De Simone, and T. Ebrahimi. Subjective quality evaluation via paired comparison: application to scalable video coding. IEEE Transactions on Multimedia, 2011. [ bib ]
[1191] F. De Simone, M. Naccari, M. Tagliasacchi, F. Dufaux, S. Tubaro, and T. Ebrahimi. Subjective quality assessment of h.264/avc video streaming with packet losses. EURASIP Journal on Image and Video Processing, 2011, Article ID 190431, 2011. [ bib ]
[1192] F. De Simone, L. Goldmann, J. S. Lee, and T. Ebrahimi. Towards high efficiency video coding: subjective evaluation of potential coding technologies. Journal of Visual Communication and Image Representation, 2011. [ bib ]
[1193] F. De Simone, L. Goldmann, J. S. Lee, and T. Ebrahimi. Performance analysis of vp8 image and video compression based on subjective evaluations. In SPIE Optics and Photonics, Applications of Digital Image Processing XXXIV, 8135, 2011. [ bib ]
[1194] S. Koelstra, C. Muehl, M. Soleymani, J. S. Lee, A. Yazdani, T. Ebrahimi, T. Pun, A. Nijholt, and I. Patras. DEAP: a database for emotion analysis using physiological signals. IEEE Trans. Affective Computing, 2011. [ bib ]
[1195] I. Ivanov, P. Vajda, J. S. Lee, and T. Ebrahimi. In tags we trust: Trust modeling in social tagging of multimedia content. IEEE Signal Processing Magazine, 2011. [ bib ]
[1196] Denis Lalanne and Agnes Lisowska Masson. A fitt of distraction: measuring the impact of distracters and multi-users on pointing efficiency. In CHI Extended Abstracts, pages 2125–2130, 2011. [ bib ]
[1197] Dalila Mekhaldi, Denis Lalanne, and Rolf Ingold. A multimodal alignment framework for spoken documents. International Journal of Multimedia Tools and Applications, 2011. [ bib ]
[1198] Ilya Boyandin, Enrico Bertini, Peter Bak, and Denis Lalanne. Flowstrates: An approach for visual exploration of temporal origin-destination data. Computer Graphics Forum, 30(3):971–980, 2011. [ bib ]
[1199] Stefano Carrino, Elena Mugellini, Omar Abou Khaled, and Rolf Ingold. Aramis: Toward a hybrid approach for human-environment interaction. In HCI (3), pages 165–174, 2011. [ bib ]
[1200] Francesco Carrino, Julien Tscherrig, Elena Mugellini, Omar Abou Khaled, and Rolf Ingold. Head-computer interface: A multimodal approach to navigate through real and virtual worlds. In HCI (2), pages 222–230, 2011. [ bib ]
[1201] Andrei Popescu-Belis, Denis Lalanne, and Hervé Bourlard. Finding information in multimedia records of meetings. Multimedia, IEEE, 2011. [ bib ]
[1202] D. Morrison, E. Bruno, and S. Marchand-Maillet. Query log simulation for long-term learning in image retrieval. In Content-Based Multimedia Indexing (CBMI'11), 2011. [ bib ]
[1203] M. von Wyl, H. Mohamed, E. Bruno, and S. Marchand-Maillet. A parallel cross-modal search engine over large-scale multimedia collections with interactive relevance feedback. In ACM International Conference on Multimedia Retrieval (ACM-ICMR'11), 2011. [ bib ]
[1204] J. Kludas and S. Marchand-Maillet. Effective multimodal information fusion by structure learning. In 14th International Conference on Information Fusion (FUSION 2011), 2011. [ bib ]
[1205] M. Soleymani, S. Koelstra, I. Patras, and T. Pun. Continuous emotion detection in response to music videos. In EmoSPACE 2011, 1st Int. Workshop on Emotion Synthesis, rePresentation, and Analysis in Continuous spacE, in conjunction with IEEE FG 2011, 2011. [ bib ]
[1206] M. Larson, M. Soleymani, P. Serdyukov, S. Rudinac, C. Wartena, G. Friedland, V. Murdock, R. Ordelman, and G. J. F. Jones. Automatic tagging and geo-tagging in video collections and communities. In ACM Int. Conf. on Multimedia Retrieval (ICMR) 2011, 2011. [ bib ]
[1207] J. Gomez, G. Bologna, B. Deville, and T. Pun. Multisource sonification for visual substitution in an auditory memory game: one, or two fingers? In ICAD 2011, Int. Conf. on Auditory Display, 2011. [ bib ]
[1208] G. Chanel, C. Rebetez, M. Betrancourt, and T. Pun. Emotion assessment from physiological signals for adaptation of games difficulty. IEEE Trans. on Systems, Man, and Cybernetics - Part A: Systems and Humans, 2011. [ bib ]
[1209] S. Koelstra, C. Mühl, M. Soleymani, J. S. Lee, A. Yazdani, T. Ebrahimi, T. Pun, A. Nijholt, and I. Patras. DEAP: A database for emotion analysis using physiological signals. IEEE Trans. on Affective Computing, Special Issue on Naturalistic Affect Resources for System Building and Evaluation, 2011. [ bib ]
[1210] A. Yüce, M. Sorci, and J. Ph. Thiran. Head pose detection using fast robust pca for side active appearance models under occlusion. In International Conference on Image Processing, Computer Vision, and Pattern Recognition (IPCV 2011), 2011. [ bib ]
[1211] Anil Yüce, M. Sorci, and J. Ph. Thiran. Head pose detection using Fast Robust PCA for Side Active Appearance Models under Occlusion. In Proceedings of the 2011 International Conference on Image Processing, Computer Vision, and Pattern Recognition (IPCV 2011), 2011. [ bib ]
[1212] N. Razavi, J. Gall, and L. Van Gool. Scalable multi-class object detection. In Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), 2011. [ bib ]
[1213] M. Mathias, A. Martinovic, J. Weissenberg, S. Haegler, and L. Van Gool. Automatic architectural style recognition. In 3D-ARCH 2011: 3D Virtual Reconstruction and Visualization of Complex Architecture, 2011. [ bib ]
[1214] J. Gall, A. Fossati, and L. Van Gool. Functional categorization of objects using real-time markerless motion capture. In Computer Vision and Pattern Recognition (CVPR), 2011. [ bib ]
[1215] H. Hamer, J. Gall, R. Urtasun, and L. Van Gool. Data-driven animation of hand-object interactions. In IEEE Conference on Automatic Face and Gesture Recognition, 2011. [ bib ]
[1216] G. Fanelli, J. Gall, and L. Van Gool. Real time head pose estimation with random regression forest. In Computer Vision and Pattern Recognition (CVPR), 2011. [ bib ]
[1217] G. Aschwanden, S. Haegler, F. Bosché, L. Van Gool, and G. Schmitt. Empiric design evaluation in urban planning. Automation in Construction, 20(3):299–310, 2011. [ bib ]
[1218] A. Lehmann, B. Leibe, and L. Van Gool. Fast PRISM: Branch and bound hough transform for object class detection. International Journal of Computer Vision, 94(2):175–197, 2011. [ bib ]
[1219] K. Moustakas, D. Tzovaras, L. Dybkjaer, N. Bernsen, and Oya Aran. Using modality replacement to facilitate communication between visually and hearing-impaired people. IEEE Multimedia, 18(2):26–37, February 2011. [ bib | DOI | Abstract ]
[1220] Daniel Gatica-Perez, Edgar Roman-Rangel, Jean-Marc Odobez, and Carlos Pallan. New world, new worlds: Visual analysis of pre-columbian pictorial collections. In Proceedings of the International Workshop on Multimedia for Cultural Heritage. Springer CCIS series book, April 2011. [ bib | .pdf | Abstract ]
[1221] Danil Korchagin. Audio spatio-temporal fingerprints for cloudless real-time hands-free diarization on mobile devices. In Proceedings of the 3rd Joint Workshop on Hands-Free Speech Communication and Microphone Arrays, May 2011. [ bib | .pdf | Abstract ]
[1222] J. S. Lee and T. Ebrahimi. Audio-visual synchronization recovery in multimedia content. In Proc. International Conference on Acoustics, Speech and Signal Processing (ICASSP'11), pages 2280–2283, Prague, Czech Republic, May 2011. [ bib ]
[1223] Francesco Orabona and Jie Luo. Ultra-fast optimization algorithm for sparse multi kernel learning. In Proceedings of the 28th International Conference on Machine Learning, June 2011. [ bib | .pdf | Abstract ]
[1224] Thomas Meyer, Andrei Popescu-Belis, Sandrine Zufferey, and Bruno Cartoni. Multilingual annotation and disambiguation of discourse connectives for machine translation. In Proceedings of 12th SIGdial Meeting on Discourse and Dialogue, pages 194–203. Association for Computational Linguistics, June 2011. [ bib | .pdf | Abstract ]
[1225] Thomas Meyer. Disambiguating temporal-contrastive discourse connectives for machine translation. In Proceedings of ACL-HLT 2011 Student Session, pages 46–51. Association for Computational Linguistics, June 2011. [ bib | .pdf | Abstract ]
[1226] Bruno Cartoni, Sandrine Zufferey, Thomas Meyer, and Andrei Popescu-Belis. How comparable are parallel corpora? measuring the distribution of general vocabulary and connectives. In Proceedings of 4th Workshop on Building and Using Comparable Corpora, pages 78–86. ACL, June 2011. [ bib | .pdf | Abstract ]
[1227] Thomas Meyer, Charlotte Roze, Bruno Cartoni, Laurence Danlos, Sandrine Zufferey, and Andrei Popescu-Belis. Disambiguating discourse connectives using parallel corpora: senses vs. translations. In Proceedings of Corpus Linguistics Conference, pages 104–105, July 2011. [ bib | .pdf ]
[1228] Danil Korchagin, Petr Motlicek, Stefan Duffner, and Hervé Bourlard. Just-in-time multimodal association and fusion from home entertainment. In Proceedings IEEE International Conference on Multimedia & Expo, July 2011. [ bib | .pdf | Abstract ]
[1229] Bruno Cartoni and Thomas Meyer. Building 'directional corpora' for unbiased contrastive analysis. In Proceedings of Corpus Linguistics Conference, pages 29–30, July 2011. [ bib | .pdf ]
[1230] Patrick Marmaroli, Jean-Marc Odobez, Xavier Falourd, and Hervé Lissek. A bimodal sound source model for vehicle tracking in traffic monitoring. In European Signal Processing Conference, August 2011. [ bib | .pdf ]
[1231] Danil Korchagin. Impact of excitation frequency on short-term recording synchronisation and confidence estimation. In Proceedings European Signal Processing Conference, August 2011. [ bib | .pdf | Abstract ]
[1232] Remi Emonet, Jagannadan Varadarajan, and Jean-Marc Odobez. Multi-camera open space human activity discovery for anomaly detection. In 8th IEEE International Conference on Advanced Video and Signal-Based Surveillance, August 2011. [ bib | .pdf | Abstract ]
[1233] Hamid Reza Abutalebi, Hedieh Heli, Danil Korchagin, and Hervé Bourlard. A bss-based approach for localization of simultaneous speakers in reverberant conditions. In Proceedings of the 19th European Signal Processing Conference (EUSIPCO), August 2011. [ bib | .pdf ]
[1234] Edgar Roman-Rangel, Carlos Pallan, Jean-Marc Odobez, and Daniel Gatica-Perez. Analyzing ancient maya glyph collections with contextual shape descriptors. International Journal of Computer Vision, 94(1):101–117, August 2011. Special Issue in Cultural Heritage and Art Preservation. Online first, Oct-2010. [ bib | DOI | .pdf | Abstract ]
[1235] J. S. Lee, F. De Simone, and T. Ebrahimi. Subjective Quality Assessment of Scalable Video Coding: A Survey. In Proceedings of the International Workshop on Quality of Multimedia Experience, pages 25–30, September 2011. [ bib ]
[1236] Thomas Meyer, Andrei Popescu-Belis, Jeevanthi Liyanapathirana, and Bruno Cartoni. A corpus-based contrastive analysis for defining minimal semantics of inter-sentential dependencies for machine translation. In Proceedings of the GSCL2011 Workshop on "Contrastive Analysis - Translation Studies - Machine Translation: What can we learn from each other?", page 5, September 2011. [ bib | .pdf | Abstract ]
[1237] Remi Emonet. Environment - application - adaptation: a community architecture for ambient intelligence. In International Conference on Ambient Computing, Applications, Services and Technologies, October 2011. [ bib | Abstract ]
[1238] Murali Mohan Chakka, André Anjos, Sébastien Marcel, Roberto Tronci, Daniele Muntoni, Gianluca Fadda, Maurizio Pili, Nicola Sirena, Gabriele Murgia, Marco Ristori, Fabio Roli, Junjie Yan, Dong Yi, Zhen Lei, Zhiwei Zhang, Stan Z. Li, William Robson Schwartz, Anderson Rocha, Helio Pedrini, Javier Lorenzo-Navarro, Modesto Castrillón-Santana, Jukka Maatta, Abdenour Hadid, and Matti Pietikainen. Competition on counter measures to 2-d facial spoofing attacks. In Proceedings of IAPR IEEE International Joint Conference on Biometrics (IJCB), Washington DC, USA, October 2011. [ bib | .pdf | Abstract ]
[1239] André Anjos and Sébastien Marcel. Counter-measures to photo attacks in face recognition: a public database and a baseline. In International Joint Conference on Biometrics 2011, October 2011. [ bib | .pdf | Abstract ]
[1240] Jian Yao and Jean-Marc Odobez. Fast human detection from joint appearance and foreground feature subset covariances. Computer Vision and Image Understanding, 115(10):1414–1426, October 2011. [ bib | .pdf | Abstract ]
[1241] Hervé Bourlard, John Dines, Mathew Magimai.-Doss, Philip N. Garner, David Imseng, Petr Motlicek, Hui Liang, Lakshmi Saheer, and Fabio Valente. Current trends in multilingual speech processing. Sadhana, 36(5):885–915, October 2011. [ bib | .pdf | Abstract ]
[1242] Jie Luo, Tatiana Tommasi, and Barbara Caputo. Multiclass transfer learning from unconstrained priors. In Proceedings of the 13th International Conference on Computer Vision, November 2011. [ bib | .pdf | Abstract ]
[1243] Dairazalia Sanchez-Cortes, Oya Aran, and Daniel Gatica-Perez. An audio visual corpus for emergent leader analysis. In Multimodal Corpora for Machine Learning: Taking Stock and Road mapping the Future, Workshop, November 2011. [ bib | Abstract ]
[1244] Edgar Roman-Rangel, Carlos Pallan, Jean-Marc Odobez, and Daniel Gatica-Perez. Searching the past: An improved shape descriptor to retrieve maya hieroglyphs. In Proceedings of the ACM International Conference in Multimedia. ACM, November 2011. [ bib | .pdf | Abstract ]
[1245] Charles Dubout and Francois Fleuret. Tasting families of features for image classification. In International Conference on Computer Vision, November 2011. [ bib | .pdf | Abstract ]
[1246] Trinh-Minh-Tri Do, Jan Blom, and Daniel Gatica-Perez. Smartphone usage in the wild: a large-scale analysis of applications and context. In 13th International Conference on Multimodal Interaction, November 2011. [ bib | .pdf | Abstract ]
[1247] Sree Hari Krishnan Parthasarathi, Daniel Gatica-Perez, Hervé Bourlard, and Mathew Magimai.-Doss. Privacy-sensitive audio features for speech/nonspeech detection. IEEE Transactions on Audio, Speech, and Language Processing, 19(8), November 2011. [ bib | .pdf | Abstract ]
[1248] P. Vajda, I. Ivanov, J. S. Lee, and T. Ebrahimi. Epitomize Your Photos. International Journal of Computer Games Technology, 2011(706893), December 2011. [ bib ]
[1249] Daniel Povey, Arnab Ghoshal, Gilles Boulianne, Lukas Burget, Ondrej Glembek, Nagendra Goel, Mirko Hannemann, Petr Motlicek, Yanmin Qian, Petr Schwarz, Jan Silovsky, Georg Stemmer, and Karel Vesely. The kaldi speech recognition toolkit. In IEEE 2011 Workshop on Automatic Speech Recognition and Understanding. IEEE Signal Processing Society, December 2011. IEEE Catalog No.: CFP11SRW-USB. [ bib | .pdf | Abstract ]
[1250] David Imseng, Ramya Rasipuram, and Mathew Magimai.-Doss. Fast and flexible kullback-leibler divergence based acoustic modeling for non-native speech recognition. In Proceedings of the IEEE workshop on Automatic Speech Recognition and Understanding, pages 348–353, December 2011. [ bib | .pdf | Abstract ]
[1251] Sandrine Zufferey, Liesbeth Degand, Andrei Popescu-Belis, and Ted Sanders. Empirical validations of multilingual annotation schemes for discourse relations. In 8th Joint ACL-ISO Workshop on Interoperable Semantic Annotation, 2012. [ bib | .pdf ]
[1252] Alessandro Vinciarelli, Samuel Kim, Fabio Valente, and Hugues Salamin. Collecting data for socially intelligent surveillance and monitoring approaches: the case of conflict in competitive conversations. In International Symposium on Communications, Control, and Signal Processing, 2012. [ bib | .pdf ]
[1253] Deepu Vijayasenan and Fabio Valente. Diartk : An open source toolkit for research in multistream speaker diarization and its application to meetings recordings. In Proceedings of Interspeech, 2012. [ bib | .pdf ]
[1254] Deepu Vijayasenan and Fabio Valente. Speaker diarization of meetings based on large tdoa feature vectors. In Proceedings of International Conference on Acoustics, Speech and Signal Processing, 2012. [ bib | .pdf ]
[1255] Fabio Valente, Samuel Kim, and Petr Motlicek. Annotation and recognition of personality traits in spoken conversations from the ami meetings corpus. In Proceedings of Interspeech 2012, 2012. [ bib | .pdf ]
[1256] Fabio Valente and Petr Motlicek. Detecting and labeling folk literature in spoken cultural heritage archives using structural and prosodic features. In IEEE Content Based Multimedia Indexing, 2012. [ bib | .pdf ]
[1257] Tatiana Tommasi, Novi Quadrianto, Barbara Caputo, and Christoph H. Lampert. Beyond dataset bias: Multi-task unaligned shared knowledge transfer. In Asian Conference on Computer Vision, 2012. [ bib | .pdf ]
[1258] Sunghyun Park, Gelareh Mohammadi, Ron Artstein, and Louis-Philippe Morency. Crowdsourcing micro-level multimedia annotations: The challenges of evaluation and interface. In Proceedings of International ACM Workshop on Crowdsourcing for Multimedia, 2012. [ bib ]
[1259] Laurent Son Nguyen, Jean-Marc Odobez, and Daniel Gatica-Perez. Using self-context for multimodal detection of head nods in face-to-face interactions. In Proceedings of the 14th ACM International Conference on Multimodal Interaction, 2012. [ bib | .pdf ]
[1260] Jesus Martinez-Gomez, Ismael Garcia-Varea, and Barbara Caputo. Baseline multimodal place classifier for the 2012 robot vision task. In Working Notes of the ImageCLEF 2012 Laboratory, 2012. [ bib | .pdf ]
[1261] Jesus Martinez-Gomez, Ismael Garcia-Varea, and Barbara Caputo. Overview of the imageclef 2012 robot vision task. In Working Notes of the ImageCLEF 2012 Laboratory, 2012. [ bib | .pdf ]
[1262] Leonidas Lefakis and Francois Fleuret. Macro-action discovery based on change point detection and boosting. In International Conference on Machine Learning and Applications, 2012. [ bib | .pdf ]
[1263] David Klotz, Johannes Wienke, Britta Wrede, Sebastian Wrede, Samira Sheikhi, Dinesh Babu Jayagopi, Vasil Khalidov, and Jean-Marc Odobez. Robot-to-group interaction in a vernissage: Architecture & dataset for multi-party dialog. In Proceedings of 5th International Conference on Cognitive Systems, 2012. [ bib | .pdf ]
[1264] Samuel Kim, Maurizio Filippone, Fabio Valente, and Alessandro Vinciarelli. Predicting the conflict level in television political debates: an approach based on crowdsourcing, nonverbal communication and gaussian processes. In ACM Multimedia, 2012. [ bib ]
[1265] Kyriaki Kalimeri, Bruno Lepri, Oya Aran, Dinesh Babu Jayagopi, Daniel Gatica-Perez, and Fabio Pianesi. Modeling dominance effects on nonverbal behaviors using granger causality. In Proceedings of International Conference on Multimodal Interaction, ICMI 2012, Santa Monica, CA, 2012. [ bib | .pdf | Abstract ]
[1266] Dinesh Babu Jayagopi, Dairazalia Sanchez-Cortes, Kazuhiro Otsuka, Junji Yamato, and Daniel Gatica-Perez. Linking speaking and looking behavior patterns with group composition, perception, and performance. In Proceedings of the International Conference on Multimodal Interaction (ICMI), Santa Monica, USA, 2012. [ bib | .pdf | Abstract ]
[1267] Marc Ferras and Hervé Bourlard. Speaker diarization and linking of large corpora. In Proceedings of the IEEE Workshop on Spoken Language Technology, 2012. [ bib | .pdf ]
[1268] Charles Dubout and Francois Fleuret. Exact acceleration of linear object detectors. In Proceedings of the European Conference on Computer Vision, 2012. [ bib | .pdf | Abstract ]
[1269] Cheng Chen and Jean-Marc Odobez. We are not contortionists: Coupled adaptive learning for head and body orientation estimation in surveillance video. In IEEE International Conference on Computer Vision and Pattern Recognition, 2012. [ bib | .pdf ]
[1270] Joan-Isaac Biel, Lucia Teijeiro-Mosquera, and Daniel Gatica-Perez. Facetube: predicting personality from facial expressions of emotion in online conversational video. In Proceedings International Conference on Multimodal Interfaces (ICMI-MLMI), 2012. [ bib | .pdf | Abstract ]
[1271] Manfredo Atzori, Arjan Gijsberts, Simone Heynen, Anne-Gabrielle Mittaz Hager, Claudio Castellini, Barbara Caputo, and Henning Müller. Experiences in the creation of an electromyography database to help hand amputated persons. In Proceedings of the 24th European Medical Informatics Conference, 2012. [ bib | .pdf ]
[1272] Jason Weston, Frédéric Ratle, Hossein Mobahi, and Ronan Collobert. Deep learning via semi-supervised embedding. In Grégoire Montavon, Geneviève Orr, and K. R. Müller, editors, Neural Networks: Tricks of the Trade. Springer, second edition, 2012. [ bib | .pdf | Abstract ]
[1273] Ronan Collobert, Koray Kavukcuoglu, and Clément Farabet. Implementing neural networks efficiently. In Grégoire Montavon, Geneviève Orr, and K. R. Müller, editors, Neural Networks: Tricks of the Trade. Springer, second edition, 2012. [ bib | .pdf | Abstract ]
[1274] Jagannadan Varadarajan, Remi Emonet, and Jean-Marc Odobez. A sequential topic model for mining recurrent activities from long term video logs. International Journal of Computer Vision, 2012. [ bib | .pdf | Abstract ]
[1275] Sree Hari Krishnan Parthasarathi, Hervé Bourlard, and Daniel Gatica-Perez. Wordless sounds: Robust speaker diarization using privacy-preserving audio representations. IEEE Transactions on Audio, Speech, and Language Processing, 2012. [ bib | .pdf | Abstract ]
[1276] Philip N. Garner, Milos Cernak, and Petr Motlicek. A simple continuous pitch estimation algorithm. IEEE Signal Processing Letters, 2012. [ bib | http | .pdf | Abstract ]
[1277] Joan-Isaac Biel and Daniel Gatica-Perez. The youtube lens: Crowdsourced personality impressions and audiovisual analysis of vlogs. IEEE Transactions on Multimedia, 2012. [ bib | .pdf | Abstract ]
[1278] Afsaneh Asaei, Hervé Bourlard, Bhiksha Raj, and Volkan Cevher. Optimal structured sparse coding for spatio-spectral information recovery. IEEE Transactions on Audio, Speech and Language Processing, 2012. [ bib | Abstract ]
[1279] Afsaneh Asaei, Hervé Bourlard, and Volkan Cevher. A method, apparatus and computer program for determining the location of a plurality of speech source. 2012US-13/654055, 2012. [ bib ]
[1280] A. Sapru and Hervé Bourlard. Automatic social role recognition in professional meetings. Idiap-RR Idiap-RR-35-2012, Idiap, 2012. [ bib | .pdf ]
[1281] Ramya Rasipuram, Peter Bell, and Mathew Magimai.-Doss. Grapheme and multilingual posterior features for under-resource speech recognition: A study on scottish gaelic. Idiap-RR Idiap-RR-34-2012, Idiap, 2012. [ bib | .pdf | Abstract ]
[1282] Hugo Penedones, Ronan Collobert, Francois Fleuret, and David Grangier. Improving object classification using pose information. Idiap-RR Idiap-RR-30-2012, Idiap, 2012. [ bib | .pdf | Abstract ]
[1283] Sree Hari Krishnan Parthasarathi, Hervé Bourlard, and Daniel Gatica-Perez. Wordless sounds: Robust speaker diarization using privacy-preserving audio representations. Idiap-RR Idiap-RR-28-2012, Idiap, 2012. [ bib | .pdf | Abstract ]
[1284] Nikolaos Pappas and Thomas Meyer. A survey on language modeling using neural networks. Idiap-RR Idiap-RR-32-2012, Idiap, 2012. [ bib | .pdf | Abstract ]
[1285] Youssef Oualil, Mathew Magimai.-Doss, Friedrich Faubel, and Dietrich Klakow. A probabilistic framework for multiple speaker localization. Idiap-RR Idiap-RR-37-2012, Idiap, 2012. Submitted to ICASSP'13. [ bib | .pdf | Abstract ]
[1286] Laurent Son Nguyen, Jean-Marc Odobez, and Daniel Gatica-Perez. Using self-context for multimodal detection of head nods in face-to-face interactions. Idiap-RR Idiap-RR-27-2012, Idiap, 2012. [ bib | .pdf ]
[1287] Petr Motlicek, Fabio Valente, and Igor Szoke. Improving acoustic based keyword spotting using lvcsr lattices. Idiap-RR Idiap-RR-36-2012, Idiap, Rue Marconi 19, 2012. [ bib | .pdf | Abstract ]
[1288] Thomas Meyer. Translation error spotting from a user's point of view. Idiap-RR Idiap-RR-31-2012, Idiap, 2012. EPFL course project paper. [ bib | .pdf | Abstract ]
[1289] Hui Liang. Data-driven enhancement of state mapping-based cross-lingual speaker adaptation. Idiap-RR Idiap-RR-38-2012, Idiap, 2012. [ bib | .pdf ]
[1290] Dinesh Babu Jayagopi, Samira Sheikhi, David Klotz, Johannes Wienke, Jean-Marc Odobez, Sebastian Wrede, Vasil Khalidov, Laurent Son Nguyen, Britta Wrede, and Daniel Gatica-Perez. The vernissage corpus: A multimodal human-robot-interaction dataset. Idiap-RR Idiap-RR-33-2012, Idiap, 2012. [ bib | .pdf | Abstract ]
[1291] Manuel Günther, Roy Wallace, and Sébastien Marcel. An open source framework for standardized comparisons of face recognition algorithms. Idiap-RR Idiap-RR-29-2012, Idiap, 2012. [ bib | .pdf | Abstract ]
[1292] D. Morrison, T. Tsikrika, V. Hollink, A. P. de Vries, E. Bruno, and S. Marchand-Maillet. Topic modelling of clickthrough data in image search. Multimedia Tools and Applications, 2012. (to appear). [ bib ]
[1293] S. Koelstra, C. Mühl, M. Soleymani, Jong-Seok Lee, A. Yazdani, T. Ebrahimi, T. Pun, A. Nijholt, and I. Patras. Deap: A database for emotion analysis using physiological signals. IEEE Trans. on Affective Computing, Special Issue on Naturalistic Affect Resources for System Building and Evaluation, 3(1), 2012. [ bib ]
[1294] M. Soleymani, J. Lichtenauer, T. Pun, and M. Pantic. A multi-modal database for affect recognition and implicit tagging. IEEE Trans. on Affective Computing, Special Issue on Naturalistic Affect Resources for System Building and Evaluation, 3(1), 2012. [ bib ]
[1295] M. Soleymani, M. Pantic, and T. Pun. Multi-modal emotion recognition in response to videos. IEEE Trans. on Affective Computing, 2012. (accepted). [ bib ]
[1296] C. Chênes, G. Chanel, M. Soleymani, and T. Pun. Highlights detection in movie scenes through inter-users physiological linkage. In Jong-Seok Lee, N. Ramzan, and R. van Zwol, editors, Social Media Retrieval, 2012. [ bib ]
[1297] M. Larson, M. Soleymani, M. Eskevich, P. Serdyukov, and G. J. F. Jones. The community and the crowd: Developing large-scale data collections for multimedia benchmarking. IEEE Multimedia, Special Issue on Large-Scale Multimedia Data Collections, 2012. (in press). [ bib ]
[1298] Ilya Boyandin, Enrico Bertini, and Denis Lalanne. A qualitative study on the exploration of temporal changes in flow maps with animation and small-multiples. Eurographics/IEEE-VGTC Symposium on Visualization 2012, Computer Graphics Forum, International Journal of the Eurographics Association, 2012. [ bib ]
[1299] F. Ringeval, M. Chetouani, and B. Schuller. Novel metrics of speech rhythm for the assessment of emotion. In Interspeech, 4 pages, Portland, OR, 2012. [ bib ]
[1300] A. Perkis, J. You, L. Xing, T. Ebrahimi, F. De Simone, M. Rerabek, P. Nasiopoulos, Z. Mai, M. Pourazad, K. Brunnstrom, K. Wang, and B. Andren. Towards Certification of 3D Video Quality Assessment. In Proceedings of the 6th International Workshop on Video Processing and Quality Metrics for Consumer Electronics, January 2012. [ bib ]
[1301] S. Koelstra, C. Muhl, M. Soleymani, J. S. Lee, A. Yazdani, T. Ebrahimi, T. Pun, A. Nijholt, and I. Patras. DEAP: A Database for Emotion Analysis Using Physiological Signals. IEEE Transactions on Affective Computing, 3(1):18–31, 2012. [ bib ]
[1302] J. S. Lee, F. De Simone, N. Ramzan, E. Izquierdo, and T. Ebrahimi. Quality Assessment of Multidimensional Video Scalability. IEEE Communications Magazine, 50(4):38–46, 2012. [ bib ]
[1303] I. Ivanov, P. Vajda, J. S. Lee, L. Goldmann, and T. Ebrahimi. Geotag Propagation in Social Networks Based on User Trust Model. Multimedia Tools and Applications, 56(1):155–177, 2012. [ bib ]
[1304] I. Ivanov, P. Vajda, J. S. Lee, and T. Ebrahimi. In Tags We Trust: Trust Modeling in Social Tagging of Multimedia Content. IEEE Signal Processing Magazine, 29(2):98–107, 2012. [ bib ]
[1305] Jean-Louis Durrieu, Jean-Philippe Thiran, and Finnian Paul Kelly. Lower and upper bounds for approximation of the Kullback-Leibler divergence between Gaussian Mixture Models. In IEEE International Conference on Acoustics, Speech and Signal Processing, 2012. [ bib | .pdf | Abstract ]
[1306] Jean-Louis Durrieu and Jean-Philippe Thiran. Musical Audio Source Separation Based on User-Selected F0 Track. In 10th International Conference on Latent Variable Analysis and Signal Separation, 2012. [ bib | http | .pdf | Abstract ]
[1307] Virginia Estellers Casas, Mihai Gurban, and Jean-Philippe Thiran. On dynamic stream weighting for Audio-Visual Speech Recognition. Transactions on Audio, Speech, and Language Processing, 20(4):1145–1157, 2012. [ bib | DOI | .pdf | Abstract ]
[1308] Virginia Estellers Casas and Jean-Philippe Thiran. Multi-pose lipreading and audio-visual speech recognition. EURASIP Journal on Advances in Signal Processing, 51, 2012. [ bib | DOI | Abstract ]
[1309] Javier Cruz Mota, Iva Bogdanova Vandergheynst, Benoît Paquier, Michel Bierlaire, and Jean-Philippe Thiran. Scale Invariant Feature Transform on the Sphere: Theory and Applications. International Journal of Computer Vision, 98(2):217–241, 2012. [ bib | DOI | .pdf ]
[1310] Alessandro Vinciarelli, Hugues Salamin, Anna Polychroniou, Gelareh Mohammadi, and Antonio Origlia. From nonverbal cues to perception: Personality and social attractiveness. In LNCS Proceedings on COGNITIVE BEHAVIOURAL SYSTEMS. Springer, 2012. [ bib ]
[1311] Jagannadan Varadarajan, Remi Emonet, and Jean-Marc Odobez. Sparsity in topic models. In Practical Applications of Sparse Modeling: Biology, Signal Processing and Beyond. MIT Press, 2012. [ bib | .pdf ]
[1312] Fabio Valente and Gerald Friedland. Speaker diarization. In Multimodal Signal Processing: Human Interactions in Meetings. Cambridge University Press, 2012. [ bib | http ]
[1313] Simon Tucker and Andrei Popescu-Belis. Evaluation of meeting support technology. In Steve Renals, Hervé Bourlard, Jean Carletta, and Andrei Popescu-Belis, editors, Multimodal Signal Processing: Human Interactions in Meetings, pages 237–252. Cambridge University Press, Cambridge, UK, 2012. [ bib ]
[1314] Andrei Popescu-Belis and Jean Carletta. Multimodal signal processing for meetings: an introduction. In Steve Renals, Hervé Bourlard, Jean Carletta, and Andrei Popescu-Belis, editors, Multimodal Signal Processing: Human Interactions in Meetings, pages 1–11. Cambridge University Press, Cambridge, UK, 2012. [ bib | .pdf ]
[1315] Denis Lalanne and Andrei Popescu-Belis. User requirements for meeting support technology. In Steve Renals, Hervé Bourlard, Jean Carletta, and Andrei Popescu-Belis, editors, Multimodal Signal Processing: Human Interactions in Meetings, pages 210–221. Cambridge University Press, Cambridge, UK, 2012. [ bib ]
[1316] Steve Renals, Hervé Bourlard, Jean Carletta, and Andrei Popescu-Belis. Multimodal Signal Processing: Human Interactions in Meetings. Cambridge University Press, Cambridge, UK, 2012. [ bib | http ]
[1317] J. K. Laurila, Daniel Gatica-Perez, I. Aad, J. Blom, Olivier Bornet, Trinh-Minh-Tri Do, O. Dousse, J. Eberle, and M. Miettinen. The mobile data challenge: Big data for mobile computing research. In Pervasive Computing, 2012. [ bib | .pdf | Abstract ]
[1318] Riwal Lefort and Francois Fleuret. A tree-based distance between distributions: application to classification of neurons. In ICASSP 2012 : IEEE International Conference on Acoustics, Speech and Signal Processing, 2012. [ bib ]
[1319] Gelareh Mohammadi, Antonio Origlia, Maurizio Pili, and Alessandro Vinciarelli. From speech to personality: Mapping voice quality and intonation into personality differences. In Proceedings of ACM Multimedia 2012, 2012. [ bib | .pdf | Abstract ]
[1320] Ramya Rasipuram and Mathew Magimai.-Doss. Combining acoustic data driven g2p and letter-to-sound rules for under resource lexicon generation. In Proceedings of Interspeech, 2012. [ bib | .pdf | Abstract ]
[1321] A. Sapru and Fabio Valente. Automatic speaker role labeling in ami meetings: Recognition of formal and social roles. In Proceedings IEEE International Conference on Acoustics, Speech and Signal Processing, Kyoto, Japan, 2012. [ bib | .pdf | Abstract ]
[1322] Tatiana Tommasi, Francesco Orabona, Mohsen Kaboli, and Barbara Caputo. Leveraging over prior knowledge for online learning of visual categories. In Proceedings of the British Machine Vision Conference, 2012. [ bib | .pdf | Abstract ]
[1323] Tim Polzehl, Katrin Schoenenberg, Sebastian Möller, Florian Metze, Gelareh Mohammadi, and Alessandro Vinciarelli. On speaker-independent personality perception and prediction from speech. In Proceedings of INTERSPEECH 2012, 2012. [ bib | .pdf | Abstract ]
[1324] Yang Sun, Mathew Magimai.-Doss, Jort F. Gemmeke, B. Cranen, Louis ten Bosch, and Lou Boves. Combination of sparse classification and multilayer perceptron for noise robust asr. In Proceedings of Interspeech, 2012. [ bib | .pdf ]
[1325] Yang Sun, B. Cranen, Jort F. Gemmeke, Lou Boves, Louis ten Bosch, and Mathew Magimai.-Doss. Using sparse classification outputs as feature observations for noise robust asr. In Proceedings of Interspeech, 2012. [ bib | .pdf ]
[1326] Björn Schuller, Stefan Steidl, Anton Batliner, Elmar Nöth, Alessandro Vinciarelli, Felix Burkhardt, Rob Van Son, Felix Weninger, Florian Eyben, Tobias Bocklet, Gelareh Mohammadi, and Benjamin Weiss. The interspeech 2012 speaker trait challenge. In Proceedings of INTERSPEECH, 2012. [ bib ]
[1327] Sree Harsha Yella and Fabio Valente. Speaker diarization of overlapping speech based on silence distribution in meeting recordings. In INTERSPEECH, 2012. [ bib | Abstract ]
[1328] Ilja Kuzborskij, Arjan Gijsberts, and Barbara Caputo. On the challenge of classifying 52 hand movements from surface electromyography. In 34th Annual Conference of the IEEE Engineering in Medicine & Biology Society, 2012. [ bib | .pdf | Abstract ]
[1329] Danil Korchagin, Stefan Duffner, Petr Motlicek, and Carl Scheffler. Multimodal cue detection engine for orchestrated entertainment. In Proceedings International Conference on MultiMedia Modeling, January 2012. [ bib | .pdf | Abstract ]
[1330] Samuel Kim, Sree Harsha Yella, and Fabio Valente. Automatic detection of conflict escalation in spoken conversations. In INTERSPEECH. ISCA, 2012. [ bib ]
[1331] Elie Khoury, Antoine Laurent, Sylvain Meignier, and Simon Petitrenaud. Combining transcription-based and acoustic-based speaker identifications for broadcast news. In IEEE International Conference on Acoustics, Speech, and Signal Processing, 2012. [ bib | .pdf ]
[1332] Maryam Habibi and Andrei Popescu-Belis. Using crowdsourcing to compare document recommendation strategies for conversations. In RecSys, Recommendation Utility Evaluation (RUE 2012), 2012. [ bib | .pdf | Abstract ]
[1333] Joan-Isaac Biel and Daniel Gatica-Perez. The good, the bad, and the angry: Analyzing crowdsourced impressions of vloggers. In Proceedings of AAAI International Conference on Weblogs and Social Media, 2012. [ bib | .pdf | Abstract ]
[1334] Manfredo Atzori, Arjan Gijsberts, Simone Heynen, Anne-Gabrielle Mittaz Hager, Olivier Deriaz, Patrick van der Smagt, Claudio Castellini, Barbara Caputo, and Henning Müller. Building the ninapro database: a resource for the biorobotics community. In Proceedings of the Fourth IEEE RAS/EMBS International Conference on Biomedical Robotics and Biomechatronics, 2012. [ bib | .pdf ]
[1335] Afsaneh Asaei, Hervé Bourlard, and Volkan Cevher. Structured sparse component analysis of compressive acoustic measurements. In Proceeding of International Speech Communication Association, 2012. [ bib ]
[1336] Roy Wallace, Mitchell McLaren, Chris McCool, and Sébastien Marcel. Cross-pollination of normalisation techniques from speaker to face authentication using gaussian mixture models. IEEE Transactions on Information Forensics and Security, 7(2):553–562, 2012. [ bib | .pdf ]
[1337] Alessandro Vinciarelli, Maja Pantic, Dirk Heylen, C. Pelachaud, I. Poggi, F. D'Errico, and M. Schroeder. Bridging the gap between social animal and unsocial machine: A survey of social signal processing. IEEE Transactions on Affective Computing, 2012. [ bib ]
[1338] Deepu Vijayasenan, Fabio Valente, and Hervé Bourlard. Multistream speaker diarization of meetings recordings beyond mfcc and tdoa features. Speech Communication, 54(1), 2012. [ bib | DOI ]
[1339] Dairazalia Sanchez-Cortes, Oya Aran, Dinesh Babu Jayagopi, Marianne Schmid Mast, and Daniel Gatica-Perez. Emergent leaders through looking and speaking: from audio-visual data to multimodal recognition. Journal on Multimodal User Interfaces, 2012. [ bib | .pdf ]
[1340] Hugues Salamin and Alessandro Vinciarelli. Automatic role recognition in multiparty conversations: an approach based on turn organization, prosody and conditional random fields. IEEE Transactions on Multimedia, 2012. [ bib ]
[1341] Lakshmi Saheer, John Dines, and Philip N. Garner. Vocal tract length normalization for statistical parametric speech synthesis. IEEE Transactions on Audio, Speech and Language Processing, 2012. [ bib | .pdf | Abstract ]
[1342] A. Pesarin, M. Cristani, V. Murino, and Alessandro Vinciarelli. Conversation analysis at work: Detection of conflict in competitive discussions through automatic turn-organization analysis. Cognitive Processing, 2012. [ bib ]
[1343] R. Montoliu, J. Blom, and Daniel Gatica-Perez. Discovering places of interest in everyday life from smartphone data. Multimedia Tools and Applications, 2012. [ bib | .pdf | Abstract ]
[1344] Gelareh Mohammadi and Alessandro Vinciarelli. Automatic attribution of personality traits based on prosodic features. IEEE Transactions on Affective Computing, 2012. [ bib | .pdf ]
[1345] Elie Khoury, Christine Sénac, and Philippe Joly. Audiovisual diarization of people in video content. Multimedia Tools and Applications, 2012. [ bib ]
[1346] Trinh-Minh-Tri Do and Daniel Gatica-Perez. Human interaction discovery in smartphone proximity networks. Personal and Ubiquitous Computing, 2012. [ bib | .pdf | Abstract ]
[1347] Gokul Chittaranjan, Jan Blom, and Daniel Gatica-Perez. Mining large-scale smartphone data for personality studies. Personal and Ubiquitous Computing, 2012. [ bib | .pdf | Abstract ]
[1348] Afsaneh Asaei, Mohammad Golbabaee, Hervé Bourlard, and Volkan Cevher. Room acoustic modeling and speech dereverberation exploiting sparsity and low-rank structures. IEEE Transactions on Audio, Speech and Language Processing, 2012. [ bib | Abstract ]
[1349] Roy Wallace, Mitchell McLaren, Chris McCool, and Sébastien Marcel. Cross-pollination of normalisation techniques from speaker to face authentication using gaussian mixture models. Idiap-RR Idiap-RR-03-2012, Idiap, 2012. [ bib | .pdf ]
[1350] Tatiana Tommasi, Francesco Orabona, Claudio Castellini, and Barbara Caputo. Improving control of dexterous hand prostheses using adaptive learning. Idiap-RR Idiap-RR-07-2012, Idiap, 2012. [ bib | .pdf | Abstract ]
[1351] Serena Soldo and Mathew Magimai.-Doss. Integrating posterior features and self-organizing maps for isolated word recognition without dynamic programming. Idiap-RR Idiap-RR-17-2012, Idiap, 2012. [ bib | .pdf | Abstract ]
[1352] Lakshmi Saheer, Hui Liang, John Dines, and Philip N. Garner. Vtln-based rapid cross-lingual adaptation for statistical parametric speech synthesis. Idiap-RR Idiap-RR-12-2012, Idiap, 2012. [ bib | .pdf | Abstract ]
[1353] Lakshmi Saheer, Junichi Yamagishi, Philip N. Garner, and John Dines. Combining vocal tract length normalization with linear transformations in a bayesian framework. Idiap-RR Idiap-RR-11-2012, Idiap, 2012. [ bib | .pdf | Abstract ]
[1354] Sandrine Revaz and Milos Cernak. Baseline system for automatic speech recognition with french globalphone database. Idiap-RR Idiap-RR-26-2012, Idiap, 2012. [ bib | .pdf | Abstract ]
[1355] Sriram Prasath Elango, Tatiana Tommasi, and Barbara Caputo. Transfer learning of visual concepts across robots: a discriminative approach. Idiap-RR Idiap-RR-06-2012, Idiap, 2012. [ bib | .pdf | Abstract ]
[1356] Daniel Povey, Arnab Ghoshal, Gilles Boulianne, Lukas Burget, Ondrej Glembek, Nagendra Goel, Mirko Hannemann, Petr Motlicek, Yanmin Qian, Petr Schwarz, Jan Silovsky, Georg Stemmer, and Karel Vesely. The kaldi speech recognition toolkit. Idiap-RR Idiap-RR-04-2012, Idiap, Rue Marconi 19, Martigny, 2012. [ bib | .pdf | Abstract ]
[1357] Youssef Oualil, Friedrich Faubel, Mathew Magimai.-Doss, and Dietrich Klakow. A tdoa gaussian mixture model for improving acoustic source tracking. Idiap-RR Idiap-RR-10-2012, Idiap, 2012. [ bib | .pdf | Abstract ]
[1358] Youssef Oualil, Friedrich Faubel, and Dietrich Klakow. A multiple hypothesis gaussian mixture filter for acoustic source localization and tracking. Idiap-RR Idiap-RR-09-2012, Idiap, 2012. Submitted to IEEE SSP Workshop 2012. [ bib | .pdf | Abstract ]
[1359] Petr Motlicek, Philip N. Garner, David Imseng, and Fabio Valente. Application of subspace gaussian mixture models in contrastive acoustic scenarios. Idiap-RR Idiap-RR-20-2012, Idiap, Rue Marconi 19, Martigny, Switzerland, 2012. [ bib | .pdf | Abstract ]
[1360] Petr Motlicek, Laurent El Shafey, Roy Wallace, Chris McCool, and Sébastien Marcel. Bi-modal authentication in mobile environments using session variability modelling. Idiap-RR Idiap-RR-18-2012, Idiap, Rue Marconi 19, 2012. [ bib | .pdf | Abstract ]
[1361] Gelareh Mohammadi and Alessandro Vinciarelli. Towards a technology of nonverbal communication: Vocal behavior in social and affective phenomena. Idiap-RR Idiap-RR-05-2012, Idiap, 2012. [ bib | .pdf | Abstract ]
[1362] Chris McCool, Sébastien Marcel, Abdenour Hadid, Matti Pietikainen, Pavel Matejka, Jan Cernocky, Norman Poh, J. Kittler, Anthony Larcher, Christophe Levy, Driss Matrouf, Jean-François Bonastre, Phil Tresadern, and Timothy Cootes. Bi-modal person recognition on a mobile phone: using mobile phone data. Idiap-RR Idiap-RR-13-2012, Idiap, 2012. [ bib | .pdf | Abstract ]
[1363] Weifeng Li and Hervé Bourlard. Sub-band based log-energy and its dynamic range stretching for robust in-car speech recognition. Idiap-RR Idiap-RR-16-2012, Idiap, 2012. [ bib | .pdf ]
[1364] Gwénolé Lecorvé, John Dines, Thomas Hain, and Petr Motlicek. Impact du degré de supervision sur l'adaptation à un domaine d'un modèle de langage à partir du web. Idiap-RR Idiap-RR-23-2012, Idiap, 2012. in French. [ bib | .pdf | Abstract ]
[1365] Gwénolé Lecorvé, John Dines, Thomas Hain, and Petr Motlicek. Supervised and unsupervised web-based language model domain adaptation. Idiap-RR Idiap-RR-22-2012, Idiap, 2012. [ bib | .pdf | Abstract ]
[1366] Gwénolé Lecorvé and Petr Motlicek. Conversion of recurrent neural network language models to weighted finite state transducers for automatic speech recognition. Idiap-RR Idiap-RR-21-2012, Idiap, 2012. [ bib | .pdf | Abstract ]
[1367] David Imseng, Hervé Bourlard, and Philip N. Garner. Boosting under-resourced speech recognizers by exploiting out of language data - case study on afrikaans. Idiap-RR Idiap-RR-15-2012, Idiap, 2012. [ bib | .pdf | Abstract ]
[1368] David Imseng, Ramya Rasipuram, and Mathew Magimai.-Doss. Fast and flexible kullback-leibler divergence based acoustic modeling for non-native speech recognition. Idiap-RR Idiap-RR-01-2012, Idiap, 2012. [ bib | .pdf | Abstract ]
[1369] Maryam Habibi and Andrei Popescu-Belis. Using crowdsourcing to compare document recommendation strategies for conversations. Idiap-RR Idiap-RR-14-2012, Idiap, 2012. [ bib | .pdf | Abstract ]
[1370] Ivana Chingovska, André Anjos, and Sébastien Marcel. On the effectiveness of local binary patterns in face anti-spoofing. Idiap-RR Idiap-RR-19-2012, Idiap, 2012. [ bib | .pdf | Abstract ]
[1371] Milos Cernak, Philip N. Garner, and Petr Motlicek. Progress report of a project in very low bit-rate speech coding. Idiap-RR Idiap-RR-08-2012, Idiap, 2012. [ bib | .pdf | Abstract ]
[1372] Holger Caesar. Integrating language identification to improve multilingual speech recognition. Idiap-RR Idiap-RR-24-2012, Idiap, 2012. [ bib | .pdf | Abstract ]
[1373] Cosmin Atanasoaei, Chris McCool, and Sébastien Marcel. Face detection using boosted jaccard distance-based regression. Idiap-RR Idiap-RR-02-2012, Idiap, 2012. Submitted to CVPR 2011. [ bib | .pdf | Abstract ]
[1374] André Anjos, Laurent El Shafey, Roy Wallace, Manuel Guenther, Chris McCool, and Sébastien Marcel. Bob: a free signal processing and machine learning toolbox for researchers. Idiap-RR Idiap-RR-25-2012, Idiap, 2012. Submitted to the ACM MM 2012 Open Source Software Competition. [ bib | .pdf | Abstract ]
[1375] Thomas Hain, Lukas Burget, John Dines, Philip N. Garner, Frantisek Grezl, Asmaa El Hannani, Marijn Huijbregts, Martin Karafiat, Mike Lincoln, and Vincent Wan. Transcribing meetings with the amida systems. IEEE Transactions on Audio, Speech, and Language Processing, 20(2):486–498, February 2012. [ bib | DOI | http | .pdf | Abstract ]
[1376] Gerald Friedland, Adam Janin, David Imseng, Xavier Anguera, Luke Gottlieb, Marijn Huijbregts, Mary Tai Knox, and Oriol Vinyals. The icsi rt-09 speaker diarization system. IEEE Transactions on Audio, Speech, and Language Processing, 20(2):371–381, February 2012. [ bib | DOI | Abstract ]
[1377] Petr Motlicek, Fabio Valente, and Igor Szoke. Improving acoustic based keyword spotting using lvcsr lattices. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, pages 4413–4416. IEEE, March 2012. [ bib | Abstract ]
[1378] Lakshmi Saheer, Junichi Yamagishi, Philip N. Garner, and John Dines. Combining vocal tract length normalization with hierarchial linear transformations. In Proceedings in International conference on Speech and Signal processing, pages 4493–4496. IEEE SPS, March 2012. [ bib | .pdf | Abstract ]
[1379] Ramya Rasipuram and Mathew Magimai.-Doss. Acoustic data-driven grapheme-to-phoneme conversion using kl-hmm. In Proceedings IEEE International Conference on Acoustics, Speech and Signal Processing, March 2012. [ bib | .pdf | Abstract ]
[1380] Samuel Kim, Fabio Valente, and Alessandro Vinciarelli. Automatic detection of conflicts in spoken conversations: ratings and analysis of broadcast political debates. In Proceedings IEEE International Conference on Acoustics, Speech and Signal Processing, March 2012. [ bib | .pdf | Abstract ]
[1381] David Imseng, Hervé Bourlard, and Philip N. Garner. Using kl-divergence and multilingual information to improve asr for under-resourced languages. In Proceedings IEEE International Conference on Acoustics, Speech and Signal Processing, pages 4869–4872, March 2012. [ bib | .pdf | Abstract ]
[1382] Afsaneh Asaei, Michael E. Davies, Hervé Bourlard, and Volkan Cevher. Computational methods for structured sparse component analysis of convolutive speech mixtures. In Proceedings IEEE International Conference on Acoustics, Speech and Signal Processing, March 2012. [ bib | Abstract ]
[1383] Thomas Meyer and Andrei Popescu-Belis. Using sense-labeled discourse connectives for statistical machine translation. In Proceedings of the EACL2012 Workshop on Hybrid Approaches to Machine Translation (HyTra), pages 129–138, April 2012. [ bib | .pdf | Abstract ]
[1384] Edgar Roman-Rangel, Jean-Marc Odobez, and Daniel Gatica-Perez. Assessing sparse coding methods for contextual shape indexing of maya hieroglyphs. Journal of Multimedia, 7(2):179–192, April 2012. [ bib | .pdf | Abstract ]
[1385] Dick C. A. Bulterman, Petr Motlicek, Stefan Duffner, and Danil Korchagin. Together Anywhere, Together Anytime, Technologies for Intimate Interactions. Dick C. A. Bulterman, editor. Centrum Wiskunde & Informatica, Amsterdam, Holland, May 2012. [ bib | Abstract ]
[1386] G. Chanel, M. Kivikangas, and N. Ravaja. Physiological compliance for social gaming analysis: cooperative versus competitive play. Interacting with Computers, May 2012. (This work was done while G. Chanel was at the CKIR, Helsinki School of Economics, Finland.) [ bib ]
[1387] Andrei Popescu-Belis, Thomas Meyer, Jeevanthi Liyanapathirana, Bruno Cartoni, and Sandrine Zufferey. Discourse-level annotation over europarl for machine translation: Connectives and pronouns. In Proceedings of the eighth international conference on Language Resources and Evaluation (LREC), page 5, May 2012. [ bib | .pdf | Abstract ]
[1388] David Imseng, Hervé Bourlard, and Philip N. Garner. Boosting under-resourced speech recognizers by exploiting out of language data - case study on afrikaans. In Proceedings of the 3rd International Workshop on Spoken Languages Technologies for Under-resourced Languages, pages 60–67, May 2012. [ bib | .pdf | Abstract ]
[1389] Bruno Cartoni and Thomas Meyer. Extracting directional and comparable corpora from a multilingual corpus for translation studies. In Proceedings of the eighth international conference on Language Resources and Evaluation (LREC), page 6, May 2012. [ bib | .pdf | Abstract ]
[1390] Katayoun Farrahi, Remi Emonet, and Alois Ferscha. Socio-technical network analysis from wearable interactions. In International Symposium on Wearable Computers, June 2012. [ bib | .pdf | Abstract ]
[1391] Jean-Marc Odobez and Oswald Lanz. Sampling techniques for audio-visual tracking and head pose estimation. In Multimodal Signal Processing: Human Interactions in Meetings, chapter 6, pages 84–102. Cambridge University Press, June 2012. [ bib | .pdf ]
[1392] Gwénolé Lecorvé, John Dines, Thomas Hain, and Petr Motlicek. Impact du degré de supervision sur l'adaptation à un domaine d'un modèle de langage à partir du web. In Actes de la conférence conjointe JEP-TALN-RECITAL 2012, volume 1, pages 193–200. ATALA/AFCP, June 2012. in French. [ bib | .pdf | Abstract ]
[1393] Nikolaos Pappas, Georgios Katsimpras, and Efstathios Stamatatos. Extracting informative textual parts from web pages containing user-generated content. In 12th International Conference on Knowledge Management and Knowledge Technologies. ACM ICPS, June 2012. [ bib | .pdf | Abstract ]
[1394] Mohammad J. Taghizadeh, Philip N. Garner, and Hervé Bourlard. Microphone array beampattern characterization for hands-free speech applications. In IEEE 7th Sensor Array and Multichannel Signal Processing Workshop (SAM), pages 473–476, June 2012. [ bib | .pdf | Abstract ]
[1395] Jagannadan Varadarajan, Remi Emonet, and Jean-Marc Odobez. Bridging the past, present and future: Modeling scene activities from event relationships and global rules. In IEEE Conference on Computer Vision and Pattern Recognition, 2012, Providence, Rhode Island, USA, June 2012. [ bib | Abstract ]
[1396] Kenneth Funes and Jean-Marc Odobez. Gaze estimation from multimodal kinect data. In IEEE Conference on Computer Vision and Pattern Recognition, Workshop on Gesture Recognition, June 2012. [ bib | .pdf | Abstract ]
[1397] Katayoun Farrahi and Daniel Gatica-Perez. Extracting mobile behavioral patterns with the distant n-gram topic model. In Proceedings of the IEEE International Symposium on Wearable Computers, June 2012. [ bib | .pdf | Abstract ]
[1398] Majid Yazdani and Andrei Popescu-Belis. Computing text semantic relatedness using the contents and links of a hypertext encyclopedia. Artificial Intelligence Journal, June 2012. [ bib | .pdf ]
[1399] Jagannadan Varadarajan, Remi Emonet, and Jean-Marc Odobez. Bridging the past, present and future: Modeling scene activities from event relationships and global rules. Idiap-RR Idiap-Internal-RR-71-2011, Idiap, June 2012. [ bib | .pdf ]
[1400] P. Korshunov, C. Araimo, F. De Simone, C. Velardo, J. L. Dugelay, and T. Ebrahimi. Evaluation of Visual Privacy Filters Impact on Video Surveillance Intelligibility. In Proceedings of the 4th International Workshop on Quality of Multimedia Experience, pages 150–151, July 2012. [ bib ]
[1401] P. Hanhart, F. De Simone, and T. Ebrahimi. Quality Assessment of Asymmetric Stereo Pair Formed From Decoded and Synthesized Views. In Proceedings of the 4th International Workshop on Quality of Multimedia Experience, pages 236–241, July 2012. [ bib ]
[1402] F. De Simone. Selected Contributions on Multimedia Quality Evaluation. PhD thesis, EPFL, Lausanne, July 2012. [ bib ]
[1403] Chris McCool, Sébastien Marcel, Abdenour Hadid, Matti Pietikainen, Pavel Matejka, Jan Cernocky, Norman Poh, J. Kittler, Anthony Larcher, Christophe Levy, Driss Matrouf, Jean-François Bonastre, Phil Tresadern, and Timothy Cootes. Bi-modal person recognition on a mobile phone: using mobile phone data. In IEEE ICME Workshop on Hot Topics in Mobile Multimedia, July 2012. [ bib | .pdf | Abstract ]
[1404] Nikolaos Pappas, Georgios Katsimpras, and Efstathios Stamatatos. An agent-based focused crawling framework for topic- and genre-related web document discovery. In 24th IEEE International Conference on Tools with Artificial Intelligence. IEEE, August 2012. [ bib | http | .pdf | Abstract ]
[1405] Arjan Gijsberts and Giorgio Metta. Real-time model learning using incremental sparse spectrum gaussian process regression. Neural Networks, August 2012. [ bib ]
[1406] Hisham Mohamed and Stephane Marchand-Maillet. Parallel approaches to permutation-based indexing using inverted files. In 5th International Conference on Similarity Search and Applications (SISAP), Toronto, CA, August 2012. [ bib ]
[1407] P. Hanhart, M. Rerabek, F. De Simone, and T. Ebrahimi. Subjective Quality Evaluation of the Upcoming HEVC Video Compression Standard. In Proceedings of SPIE, volume 8499 of Applications of Digital Image Processing XXXV, August 2012. [ bib ]
[1408] Youssef Oualil, Friedrich Faubel, Mathew Magimai.-Doss, and Dietrich Klakow. A tdoa gaussian mixture model for improving acoustic source tracking. In Youssef Oualil, editor, 20th European Signal Processing Conference, August 2012. [ bib | .pdf | Abstract ]
[1409] Manuel Günther, Dennis Haufe, and Rolf P. Würtz. Face recognition with disparity corrected Gabor phase differences. In Alessandro E. P. Villa, Wlodzislaw Duch, Péter Érdi, Francesco Masulli, and Günther Palm, editors, Artificial Neural Networks and Machine Learning, volume 7552 of Lecture Notes in Computer Science, pages 411–418. Springer Berlin, September 2012. [ bib | DOI | .pdf | Abstract ]
[1410] Laurent El Shafey, Roy Wallace, and Sébastien Marcel. Face verification using gabor filtering and adapted gaussian mixture models. In Proceedings of the 11th International Conference of the Biometrics Special Interest Group, pages 397–408. GI-Edition, September 2012. [ bib | .pdf | Abstract ]
[1411] Ivana Chingovska, André Anjos, and Sébastien Marcel. On the effectiveness of local binary patterns in face anti-spoofing. In Proceedings of the 11th International Conference of the Biometrics Special Interest Group, September 2012. [ bib | .pdf | Abstract ]
[1412] Ke Sun, Eric Bruno, and Stephane Marchand-Maillet. Unsupervised skeleton learning for manifold denoising and outlier detection. In International Conference on Pattern Recognition (ICPR'2012), Tsukuba, JP, September 2012. [ bib ]
[1413] Ke Sun, Eric Bruno, and Stephane Marchand-Maillet. Stochastic unfolding. In IEEE Machine Learning for Signal Processing Workshop (MLSP'2012), Santander, Spain, September 2012. [ bib ]
[1414] Marc von Wyl, Birgit Hofreiter, and Stephane Marchand-Maillet. Serendipitous exploration of large-scale product catalogs. In 14th IEEE International Conference on Commerce and Enterprise Computing (CEC 2012), Hangzhou, CN, September 2012. [ bib ]
[1415] P. Korshunov, C. Araimo, F. De Simone, C. Velardo, J. L. Dugelay, and T. Ebrahimi. Subjective Study of Privacy Filters in Video Surveillance. In Proceedings of the IEEE International Workshop on Multimedia Signal Processing, September 2012. [ bib ]
[1416] Gwénolé Lecorvé, John Dines, Thomas Hain, and Petr Motlicek. Supervised and unsupervised web-based language model domain adaptation. In Proceedings of Interspeech, to appear, September 2012. [ bib | .pdf | Abstract ]
[1417] Gwénolé Lecorvé and Petr Motlicek. Conversion of recurrent neural network language models to weighted finite state transducers for automatic speech recognition. In Proceedings of Interspeech, to appear, September 2012. [ bib | .pdf | Abstract ]
[1418] Youssef Oualil, Mathew Magimai.-Doss, Friedrich Faubel, and Dietrich Klakow. Joint detection and localization of multiple speakers using a probabilistic interpretation of the steered response power. In Youssef Oualil, editor, Statistical and Perceptual Audition Workshop, September 2012. [ bib | .pdf | Abstract ]
[1419] Youssef Oualil, Friedrich Faubel, and Dietrich Klakow. A multiple hypothesis gaussian mixture filter for acoustic source localization and tracking. In Youssef Oualil, editor, 13th International Workshop on Acoustic Signal Enhancement, pages 233–236, September 2012. [ bib | .pdf | Abstract ]
[1420] Hong Lu, Mashfiqui Rabbi, Gokul Chittaranjan, Denise Frauendorfer, Marianne Schmid Mast, Andrew T. Campbell, Daniel Gatica-Perez, and Tanzeem Choudhury. Stresssense: Detecting stress in unconstrained acoustic environments using smartphones. In Ubicomp'12, September 2012. [ bib | .pdf ]
[1421] Weifeng Li and Hervé Bourlard. Sub-band based log-energy and its dynamic range stretching for robust in-car speech recognition. In Proceedings of the 13th Annual Conference of the International Speech Communication Association (InterSpeech), September 2012. [ bib | .pdf ]
[1422] Anindya Roy, Mathew Magimai.-Doss, and Sébastien Marcel. Boosting localized binary features for speech recognition. In Symposium on Machine Learning in Speech and Language Processing (MLSLP), September 2012. [ bib | .pdf ]
[1423] Serena Soldo, Mathew Magimai.-Doss, and Hervé Bourlard. Template-based asr using posterior features and synthetic references: comparing different tts systems. In SAPA-SCALE Conference, International Speech Communication Association, September 2012. [ bib | .pdf | Abstract ]
[1424] Serena Soldo, Mathew Magimai.-Doss, and Hervé Bourlard. Synthetic references for template-based asr using posterior features. In Proceedings of Interspeech, September 2012. [ bib | .pdf | Abstract ]
[1425] Arthur Kantor, Milos Cernak, Jiri Havelka, Sean Huber, Jan Kleindienst, and Doris B. Gonzalez. Reading companion: The technical and social design of an automated reading tutor. In Workshop on Child, Computer and Interaction, September 2012. [ bib | .pdf | Abstract ]
[1426] David Imseng, John Dines, Petr Motlicek, Philip N. Garner, and Hervé Bourlard. Comparing different acoustic modeling techniques for multilingual boosting. In Proceedings of Interspeech, page to appear, September 2012. [ bib | .pdf ]
[1427] Marco Fornoni and Barbara Caputo. Indoor scene recognition using task and saliency-driven feature pooling. In Proceedings of the British Machine Vision Conference, September 2012. [ bib | .pdf | Abstract ]
[1428] Trinh-Minh-Tri Do and Daniel Gatica-Perez. Contextual conditional models for smartphone-based human mobility prediction. In Proceedings of the 14th ACM International Conference on Ubiquitous Computing, September 2012. [ bib | .pdf | Abstract ]
[1429] Milos Cernak, David Imseng, and Hervé Bourlard. Robust triphone mapping for acoustic modeling. In Proceedings of Interspeech, page to appear, September 2012. [ bib | .pdf | Abstract ]
[1430] Afsaneh Asaei, Bhiksha Raj, Hervé Bourlard, and Volkan Cevher. Structured sparse coding for microphone array position calibration. In SAPA-SCALE Conference, International Speech Communication Association, September 2012. [ bib | Abstract ]
[1431] Shajith Ikbal, Hemant Misra, Hynek Hermansky, and Mathew Magimai.-Doss. Phase autocorrelation (pac) features for noise robust speech recognition. Speech Communication, 54(7):867–880, September 2012. [ bib | DOI ]
[1432] Samira Sheikhi and Jean-Marc Odobez. Investigating the midline effect for visual focus of attention recognition. In Int. Conf. on Multimodal Interaction (ICMI), Santa Monica, October 2012. [ bib | .pdf ]
[1433] Samira Sheikhi, Vasil Khalidov, and Jean-Marc Odobez. Recognizing the visual focus of attention for human robot interaction. In IEEE International Conference on Intelligent Robots and Systems (IROS) - Human Behavior Understanding Workshop (IROS-HBU), October 2012. [ bib | .pdf ]
[1434] Jean-Marc Odobez, C. Carincotte, Remi Emonet, E. Jouneau, Sofia Zaidenberg, Bertrand Raverra, Francois Bremond, and Andrea Grifoni. Unsupervised activity analysis and monitoring algorithms for effective surveillance systems. In European Conference on Computer Vision, LNCS, October 2012. [ bib | .pdf ]
[1435] Najeh Hajlaoui and Andrei Popescu-Belis. Translating english discourse connectives into arabic: a corpus-based analysis and an evaluation metric. In Fourth Workshop on Computational Approaches to Arabic Script-based Languages at Proceedings of the Tenth Biennial Conference of the Association for Machine Translation in the Americas (AMTA), October 2012. [ bib | .pdf | Abstract ]
[1436] Manuel Günther, Roy Wallace, and Sébastien Marcel. An open source framework for standardized comparisons of face recognition algorithms. In Andrea Fusiello, Vittorio Murino, and Rita Cucchiara, editors, Computer Vision - ECCV 2012. Workshops and Demonstrations, volume 7585 of Lecture Notes in Computer Science, pages 547–556, Rue Marconi 19, CH - 1920 Martigny, Switzerland, October 2012. Idiap Research Institute, Springer Berlin. [ bib | DOI | .pdf | Abstract ]
[1437] André Anjos, Laurent El Shafey, Roy Wallace, Manuel Günther, Chris McCool, and Sébastien Marcel. Bob: a free signal processing and machine learning toolbox for researchers. In Proceedings of the ACM Multimedia Conference, October 2012. [ bib | http | .pdf | Abstract ]
[1438] Thomas Meyer, Andrei Popescu-Belis, Najeh Hajlaoui, and Andrea Gesmundo. Machine translation of labeled discourse connectives. In Proceedings of the Tenth Biennial Conference of the Association for Machine Translation in the Americas (AMTA), October 2012. [ bib ]
[1439] Nicolae Suditu and Francois Fleuret. Iterative relevance feedback with adaptive exploration/exploitation trade-off. In Proceedings of the 21st ACM Conference on Information and Knowledge Management, page 9, October 2012. [ bib | .pdf | Abstract ]
[1440] Petr Motlicek, Laurent El Shafey, Roy Wallace, Chris McCool, and Sébastien Marcel. Bi-modal authentication in mobile environments using session variability modelling. In Proceedings of the 21st International Conference on Pattern Recognition, November 2012. [ bib | .pdf | Abstract ]
[1441] Tiago de Freitas Pereira, André Anjos, José Mario De Martino, and Sébastien Marcel. Lbp-top based countermeasure against face spoofing attacks. In International Workshop on Computer Vision With Local Binary Pattern Variants - ACCV, page 12, November 2012. [ bib | .pdf | Abstract ]
[1442] I. Ivanov, P. Vajda, J. S. Lee, P. Korshunov, and T. Ebrahimi. Geotag Propagation with User Trust Modeling. In N. Ramzan, R. van Zwol, J. S. Lee, K. Clüver, and X. S. Hua, editors, Social Media Retrieval, Computer Communications and Networks. Springer, November 2012. [ bib ]
[1443] Dairazalia Sanchez-Cortes, Petr Motlicek, and Daniel Gatica-Perez. Assessing the impact of language style on emergent leadership perception from ubiquitous audio. In Proceedings of the 11th International Conference on Mobile and Ubiquitous Multimedia, December 2012. [ bib | .pdf ]
[1444] Eric Malmi, Trinh-Minh-Tri Do, and Daniel Gatica-Perez. Checking in or checked in: Comparing large-scale manual and automatic location disclosure patterns. In Proceedings of the 11th International Conference on Mobile and Ubiquitous Multimedia, December 2012. [ bib ]
[1445] Elie Khoury, Laurent El Shafey, and Sébastien Marcel. The idiap speaker recognition evaluation system at nist sre 2012. In NIST Speaker Recognition Conference. NIST, December 2012. [ bib | .pdf | Abstract ]
[1446] David Imseng, Hervé Bourlard, Holger Caesar, Philip N. Garner, Gwénolé Lecorvé, and Alexandre Nanchen. Mediaparl: Bilingual mixed language accented speech database. In Proceedings of the 2012 IEEE Workshop on Spoken Language Technology, pages 263–268, December 2012. [ bib | .pdf ]
[1447] Cong-Thanh Do, Mohammad J. Taghizadeh, and Philip N. Garner. Combining cepstral normalization and cochlear implant-like speech processing for microphone array-based speech recognition. In Proceedings of the IEEE Workshop on Spoken Language Technology, December 2012. [ bib | .pdf | Abstract ]
[1448] Katayoun Farrahi and Daniel Gatica-Perez. A probabilistic approach to mining mobile phone data sequences. Personal and Ubiquitous Computing, December 2012. [ bib | .pdf | Abstract ]
[1449] Stefan Duffner, Petr Motlicek, and Danil Korchagin. The ta2 database – a multi-modal database from home entertainment. International Journal of Computer and Electrical Engineering, 4(5):670–673, December 2012. [ bib | http | .pdf | Abstract ]
[1450] Trinh-Minh-Tri Do and Thierry Artieres. Regularized bundle methods for convex and non-convex risks. Journal of Machine Learning Research, 13:3539–3583, December 2012. [ bib | .pdf | Abstract ]
[1451] Rahim Saedi, Kong Aik Lee, Tomi Kinnunen, Tawfik Hasan, Benoit Fauve, Pierre-Michel Bousquet, Elie Khoury, Pablo Luis Sordo Martinez, Jia Min Karen Kua, Changhuai You, Hanwu Sun, Anthony Larcher, Padmanabhan Rajan, Ville Hautamäki, Cemal Hanilci, Billy Braithwaite, Gonzalez-Hautamäki Rosa, Seyed Omid Sadjadi, Gang Liu, Hynek Boril, Navid Shokouhi, Driss Matrouf, Laurent El Shafey, Pejman Mowlaee, Julien Epps, Tharmarajah Thiruvaran, David Van Leeuwen, Bin Ma, Haizhou Li, John Hansen, Jean-François Bonastre, Sébastien Marcel, John Mason, and Eliathamby Ambikairajah. I4u submission to nist sre 2012: a large-scale collaborative effort for noise-robust speaker verification. Idiap-RR Idiap-Internal-RR-75-2012, Idiap, December 2012. [ bib | .pdf | Abstract ]
[1452] Kong Aik Lee, Rahim Saedi, Tawfik Hasan, Tomi Kinnunen, Benoit Fauve, Pierre-Michel Bousquet, Elie Khoury, Pablo Luis Sordo Martinez, Tharmarajah Thiruvaran, Changhuai You, Padmanabhan Rajan, David Van Leeuwen, Seyed Omid Sadjadi, Driss Matrouf, Laurent El Shafey, John Mason, Eliathamby Ambikairajah, Hanwu Sun, Anthony Larcher, Bin Ma, Haizhou Li, Ville Hautamäki, Cemal Hanilci, Billy Braithwaite, Gonzalez-Hautamäki Rosa, Gang Liu, Hynek Boril, Navid Shokouhi, John Hansen, Jean-François Bonastre, and Sébastien Marcel. The i4u submission to the 2012 nist speaker recognition evaluation. Idiap-RR Idiap-Internal-RR-74-2012, Idiap, December 2012. [ bib | .pdf ]
[1453] Tiago de Freitas Pereira, Jukka Komulainen, André Anjos, José Mario De Martino, Abdenour Hadid, Matti Pietikainen, and Sébastien Marcel. Face liveness using dynamic texture. Idiap-RR Idiap-Internal-RR-79-2012, Idiap, December 2012. Submitted to EURASIP. [ bib | http | .pdf | Abstract ]
[1454] Sree Harsha Yella and Hervé Bourlard. Improved overlap speech diarization of meeting recordings using long-term conversational features. In ICASSP, 2013. [ bib | .pdf ]
[1455] Majid Yazdani, Ronan Collobert, and Andrei Popescu-Belis. Learning to rank on network data. In Mining and Learning with Graphs, 2013. [ bib | .pdf ]
[1456] Majid Yazdani and Andrei Popescu-Belis. Computing text semantic relatedness using the contents and links of a hypertext encyclopedia. In International Joint Conference on artificial intelligence, 2013. [ bib | .pdf ]
[1457] Romain Tavenard, Remi Emonet, and Jean-Marc Odobez. Time-sensitive topic models for action recognition in videos. In IEEE International Conference on Image Processing, 2013. [ bib | .pdf ]
[1458] Raphael Sznitman, Carlos Becker, Francois Fleuret, and Pascal Fua. Fast object detection with entropy-driven evaluation. In Proceedings of the Conference on Computer Vision and Pattern Recognition, 2013. [ bib ]
[1459] Gyorgy Szaszak and Andras Beke. Using phonological phrase segmentation to improve automatic keyword spotting for the highly agglutinating hungarian language. In Proc. of Interspeech 2013, 2013. [ bib ]
[1460] A. Sapru and Hervé Bourlard. Automatic social role recognition in professional meetings using conditional random fields. In Proceedings of Interspeech, 2013. [ bib | .pdf ]
[1461] Ramya Rasipuram and Mathew Magimai.-Doss. Improving grapheme-based asr by probabilistic lexical modeling approach. In Proceedings of Interspeech, 2013. [ bib | .pdf | Abstract ]
[1462] Ramya Rasipuram, Peter Bell, and Mathew Magimai.-Doss. Grapheme and multilingual posterior features for under-resourced speech recognition: A study on scottish gaelic. In IEEE International Conference on Acoustics, Speech and Signal Processing, 2013. [ bib | .pdf | Abstract ]
[1463] Nikolaos Pappas and Andrei Popescu-Belis. Sentiment analysis of user comments for one-class collaborative filtering over ted talks. In 36th ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, 2013. [ bib | .pdf | Abstract ]
[1464] Nikolaos Pappas, Georgios Katsimpras, and Efstathios Stamatatos. Distinguishing the popularity between topics: A system for up-to-date opinion retrieval and mining in the web. In 14th International Conference on Intelligent Text Processing and Computational Linguistics. LNCS, ACM, 2013. [ bib | http | .pdf | Abstract ]
[1465] Nikolaos Pappas and Andrei Popescu-Belis. Combining content with user preferences for ted lecture recommendation. In 11th International Workshop on Content Based Multimedia Indexing. IEEE, 2013. [ bib | .pdf | Abstract ]
[1466] Gelareh Mohammadi, Sunghyun Park, Kenji Sagae, Alessandro Vinciarelli, and Louis-Philippe Morency. Who is persuasive? the role of perceived personality and communication modality in social multimedia. In International Conference on Multimodal Interaction, 2013. [ bib ]
[1467] Jesus Martinez-Gomez, Ismael Garcia-Varea, Miguel Cazorla, and Barbara Caputo. Overview of the imageclef 2013 robot vision task. In Working Notes, CLEF 2013, 2013. [ bib | .pdf ]
[1468] Dinesh Babu Jayagopi, Samira Sheikhi, David Klotz, Johannes Wienke, Jean-Marc Odobez, Sebastian Wrede, Vasil Khalidov, Laurent Son Nguyen, Britta Wrede, and Daniel Gatica-Perez. The vernissage corpus: a conversational human-robot-interaction dataset. In Proceedings of the 8th ACM/IEEE international conference on Human-robot interaction, 2013. [ bib | .pdf ]
[1469] Dinesh Babu Jayagopi and Jean-Marc Odobez. Given that, should i respond? contextual addressee estimation in multi-party human-robot interactions. In Proceedings of Human Robot Interaction (HRI) Conference, 2013. [ bib | .pdf | Abstract ]
[1470] David Imseng and Hervé Bourlard. Speaker adaptive kullback-leibler divergence based hidden markov models. In Proceedings IEEE International Conference on Acoustics, Speech and Signal Processing, 2013. [ bib | .pdf ]
[1471] Alexandre Heili and Jean-Marc Odobez. Parameter estimation and contextual adaptation for a multi-object tracking crf model. In IEEE Workshop on Performance Evaluation of Tracking and Surveillance, 2013. [ bib | .pdf ]
[1472] Maryam Habibi and Andrei Popescu-Belis. Diverse keyword extraction from conversations. In Proceedings of the ACL 2013 (51st Annual Meeting of the Association for Computational Linguistics), Short Papers, 2013. [ bib | .pdf | Abstract ]
[1473] Arjan Gijsberts and Barbara Caputo. Exploiting accelerometers to improve movement classification for prosthetics. In International Conference on Rehabilitation Robotics, 2013. [ bib | .pdf ]
[1474] Charles Dubout and Francois Fleuret. Accelerated training of linear object detectors. In CVPR 2013 Workshop on Structured Prediction, 2013. [ bib | www: | .pdf ]
[1475] Charles Dubout and Francois Fleuret. Deformable part models with individual part scaling. In British Machine Vision Conference, 2013. [ bib ]
[1476] Alice Aubert, Romain Tavenard, Simon Malinowski, Thomas Guyet, René Quiniou, Jean-Marc Odobez, Remi Emonet, and Chantal Gascuel. Discovering temporal patterns in water quality time series, focusing on floods with the lda method. In European Geosciences Union, 2013. [ bib | .pdf ]
[1477] Afsaneh Asaei, Bhiksha Raj, Hervé Bourlard, and Volkan Cevher. A multipath sparse beamforming method. In Signal Processing with Adaptive Sparse Structured Representations SPARS, 2013. [ bib | Abstract ]
[1478] Afsaneh Asaei, Mohammad Golbabaee, Hervé Bourlard, and Volkan Cevher. Room acoustic modeling exploiting joint sparsity and low-rank structures. In Signal Processing with Adaptive Sparse Structured Representations SPARS, 2013. [ bib | Abstract ]
[1479] Fabian Nater, Tatiana Tommasi, Luc Van Gool, and Barbara Caputo. Learning to learn new models of human activities in indoor settings. In Hervé Bourlard and Andrei Popescu-Belis, editors, Interactive Multimodal Information Management. EPFL Press, 2013. [ bib | .pdf ]
[1480] Barbara Caputo. Medical image annotation. In Hervé Bourlard and Andrei Popescu-Belis, editors, Interactive Multimodal Information Management. EPFL Press, 2013. [ bib | .pdf ]
[1481] Björn Schuller, Stefan Steidl, Anton Batliner, Elmar Nöth, Alessandro Vinciarelli, Felix Burkhardt, Felix Weninger, Florian Eyben, Tobias Bocklet, Gelareh Mohammadi, and Benjamin Weiss. A survey on perceived speaker traits: Personality, likability, pathology and the first challenge. Computer Speech and Language, 2013. [ bib ]
[1482] Chris McCool, Roy Wallace, Mitchell McLaren, Laurent El Shafey, and Sébastien Marcel. Session variability modelling for face authentication. IET Biometrics, 2013. [ bib | .pdf | Abstract ]
[1483] Riwal Lefort and Francois Fleuret. treekl: A distance between high dimension empirical distributions. Pattern Recognition Letters, 34(2):140–145, 2013. [ bib | .pdf ]
[1484] David Imseng, Hervé Bourlard, John Dines, Philip N. Garner, and Mathew Magimai.-Doss. Applying multi- and cross-lingual stochastic phone space transformations to non-native speech recognition. IEEE Transactions on Audio, Speech, and Language Processing, 2013. [ bib | Abstract ]
[1485] David Imseng, Petr Motlicek, Hervé Bourlard, and Philip N. Garner. Using out-of-language data to improve an under-resourced speech recognizer. Speech Communication, 2013. [ bib | DOI | http | .pdf | Abstract ]
[1486] Trinh-Minh-Tri Do and Daniel Gatica-Perez. The places of our lives: Visiting patterns and automatic labeling from longitudinal smartphone data. IEEE Transactions on Mobile Computing, 2013. [ bib | .pdf | Abstract ]
[1487] Tatiana Tommasi, Francesco Orabona, and Barbara Caputo. Learning categories from few examples with multi model knowledge transfer. Idiap-RR Idiap-RR-16-2013, Idiap, 2013. [ bib | .pdf | Abstract ]
[1488] Gyorgy Szaszak, Milos Cernak, Philip N. Garner, Petr Motlicek, Alexandre Nanchen, and Flavio Tarsetti. Automatic speech indexing system of bilingual video parliament interventions. Idiap-RR Idiap-RR-25-2013, Idiap, 2013. [ bib | .pdf | Abstract ]
[1489] Gyorgy Szaszak and Andras Beke. Using phonological phrase segmentation to improve automatic keyword spotting for the highly agglutinating hungarian language. Idiap-RR Idiap-RR-23-2013, Idiap, 2013. [ bib | .pdf | Abstract ]
[1490] Gyorgy Szaszak. Adaptation experiments on french mediaparl asr. Idiap-RR Idiap-RR-10-2013, Idiap, 2013. [ bib | .pdf | Abstract ]
[1491] Ramya Rasipuram and Mathew Magimai.-Doss. Probabilistic lexical modeling and grapheme-based automatic speech recognition. Idiap-RR Idiap-RR-15-2013, Idiap, 2013. Submitted to Speech Communication. [ bib | .pdf | Abstract ]
[1492] Ramya Rasipuram and Mathew Magimai.-Doss. Improving grapheme-based asr by probabilistic lexical modeling approach. Idiap-RR Idiap-RR-14-2013, Idiap, 2013. [ bib | .pdf | Abstract ]
[1493] Ramya Rasipuram and Mathew Magimai.-Doss. Kl-hmm and probabilistic lexical modeling. Idiap-RR Idiap-RR-04-2013, Idiap, 2013. [ bib | .pdf | Abstract ]
[1494] Pedro H. O. Pinheiro and Ronan Collobert. Recurrent convolutional neural networks for scene parsing. Idiap-RR Idiap-RR-22-2013, Idiap, 2013. [ bib | .pdf | Abstract ]
[1495] Dimitri Palaz, Ronan Collobert, and Mathew Magimai.-Doss. Estimating phoneme class conditional probabilities from raw speech signal using convolutional neural networks. Idiap-RR Idiap-RR-13-2013, Idiap, 2013. [ bib | .pdf | Abstract ]
[1496] Xingyu Na and Philip N. Garner. Convolutional pitch target approximation model for speech synthesis. Idiap-RR Idiap-RR-05-2013, Idiap, 2013. [ bib | .pdf | Abstract ]
[1497] Chris McCool, Roy Wallace, Mitchell McLaren, Laurent El Shafey, and Sébastien Marcel. Session variability modelling for face authentication. Idiap-RR Idiap-RR-17-2013, Idiap, 2013. [ bib | .pdf | Abstract ]
[1498] Hui Liang and John Dines. Enhancing state mapping-based cross-lingual speaker adaptation using phonological knowledge in a data-driven manner. Idiap-RR Idiap-RR-08-2013, Idiap, 2013. [ bib | .pdf | Abstract ]
[1499] David Imseng, Petr Motlicek, Hervé Bourlard, and Philip N. Garner. Using out-of-language data to improve an under-resourced speech recognizer. Idiap-RR Idiap-RR-09-2013, Idiap, 2013. [ bib | .pdf | Abstract ]
[1500] David Imseng, Hervé Bourlard, Holger Caesar, Philip N. Garner, Gwénolé Lecorvé, and Alexandre Nanchen. Mediaparl: Bilingual mixed language accented speech database. Idiap-RR Idiap-RR-03-2013, Idiap, 2013. [ bib | .pdf ]
[1501] David Imseng, John Dines, Petr Motlicek, Philip N. Garner, and Hervé Bourlard. Comparing different acoustic modeling techniques for multilingual boosting. Idiap-RR Idiap-RR-01-2013, Idiap, 2013. [ bib | .pdf | Abstract ]
[1502] Philip N. Garner and David Imseng. Statistical models for hmm/ann hybrids. Idiap-RR Idiap-RR-11-2013, Idiap, 2013. [ bib | .pdf | Abstract ]
[1503] Remi Emonet and Jean-Marc Odobez. Unsupervised methods for activity analysis and detection of abnormal events. Idiap-RR Idiap-RR-21-2013, Idiap, 2013. [ bib | .pdf ]
[1504] Remi Emonet and Jean-Marc Odobez. Analyse non supervisée d'activités en vidéo surveillance pour l'analyse de scène et la détection d'événements anormaux. Idiap-RR Idiap-RR-20-2013, Idiap, 2013. [ bib | http | .pdf ]
[1505] Laurent El Shafey, Chris McCool, Roy Wallace, and Sébastien Marcel. A scalable formulation of probabilistic linear discriminant analysis: Applied to face recognition. Idiap-RR Idiap-RR-07-2013, Idiap, 2013. Accepted for publication. [ bib | .pdf | Abstract ]
[1506] Tiago de Freitas Pereira, André Anjos, José Mario De Martino, and Sébastien Marcel. Can face anti-spoofing countermeasures work in a real world scenario? Idiap-RR Idiap-Internal-RR-03-2013, Idiap, January 2013. Submitted to the International Conference on Biometrics (ICB'13). [ bib | .pdf | Abstract ]
[1507] Ivana Chingovska, André Anjos, and Sébastien Marcel. The 2nd competition on counter measures to 2d face spoofing attacks. Idiap-RR Idiap-RR-18-2013, Idiap, 2013. [ bib | .pdf | Abstract ]
[1508] Milos Cernak, Xingyu Na, and Philip N. Garner. Syllable-based pitch encoding for low bit rate speech coding with recognition/synthesis architecture. Idiap-RR Idiap-RR-24-2013, Idiap, 2013. [ bib | .pdf | Abstract ]
[1509] Milos Cernak, Petr Motlicek, and Philip N. Garner. On the (un)importance of the contextual factors in hmm-based speech synthesis and coding. Idiap-RR Idiap-RR-06-2013, Idiap, 2013. [ bib | .pdf ]
[1510] Raphael Ullmann, Hervé Bourlard, Jens Berger, and Anna Llagostera Casanovas. Noise intrusiveness factors in speech telecommunications. In Proceedings of the AIA-DAGA 2013 International Conference on Acoustics, pages 436–439, March 2013. [ bib | .pdf ]
[1511] Najeh Hajlaoui and Andrei Popescu-Belis. Assessing the accuracy of discourse connective translations: Validation of an automatic metric. In 14th International Conference on Intelligent Text Processing and Computational Linguistics, page 12. University of the Aegean, Springer, March 2013. [ bib | .pdf | Abstract ]
[1512] Stefan Duffner and Jean-Marc Odobez. A track creation and deletion framework for long-term online multi-face tracking. IEEE Transactions on Image Processing, March 2013. [ bib | .pdf ]
[1513] Alvaro Marcos-Ramiro, Daniel Pizarro-Perez, Marta Marron-Romera, Laurent Son Nguyen, and Daniel Gatica-Perez. Body communicative cue extraction for conversational analysis. In Proceedings of IEEE International Conference on Automatic Face and Gesture Recognition, April 2013. [ bib | .pdf ]
[1514] Elie Khoury, Paul Gay, and Jean-Marc Odobez. Fusing matching and biometric similarity measures for face diarization in video. In Proceedings of the 3rd ACM conference on International conference on multimedia retrieval, pages 97–104. ACM, April 2013. [ bib | .pdf | Abstract ]
[1515] Bruno Cartoni, Sandrine Zufferey, and Thomas Meyer. Annotating the meaning of discourse connectives by looking at their translation: The translation-spotting technique. Dialogue & Discourse, 4(2):65–86, April 2013. [ bib | DOI | .pdf | Abstract ]
[1516] Milos Cernak, Petr Motlicek, and Philip N. Garner. On the (un)importance of the contextual factors in hmm-based speech synthesis. In Proceedings of the IEEE Intl. Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 8140–8143, May 2013. [ bib | .pdf ]
[1517] Trinh-Minh-Tri Do and Daniel Gatica-Perez. Where and what: Using smartphones to predict next locations and applications in daily life. Pervasive and Mobile Computing, May 2013. [ bib | .pdf | Abstract ]
[1518] Edgar Roman-Rangel, Jean-Marc Odobez, and Daniel Gatica-Perez. Evaluating shape descriptors for detection of maya hieroglyphs. In Proc. Mexican Conf. on Pattern Recognition, June 2013. [ bib | .pdf ]
[1519] Thomas Meyer and Bonnie Webber. Implicitation of discourse connectives in (machine) translation. In Proceedings of the 1st DiscoMT Workshop at ACL 2013 (51st Annual Meeting of the Association for Computational Linguistics), pages 19–26, June 2013. [ bib | .pdf | Abstract ]
[1520] Thomas Meyer, Cristina Grisot, and Andrei Popescu-Belis. Detecting narrativity to improve english to french translation of simple past verbs. In Proceedings of the 1st DiscoMT Workshop at ACL 2013 (51st Annual Meeting of the Association for Computational Linguistics), pages 33–42, June 2013. [ bib | .pdf | Abstract ]
[1521] Thomas Meyer and Lucie Polakova. Machine translation with many manually labeled discourse connectives. In Proceedings of the 1st DiscoMT Workshop at ACL 2013 (51st Annual Meeting of the Association for Computational Linguistics), pages 43–50, June 2013. [ bib | .pdf | Abstract ]
[1522] Ilja Kuzborskij and Francesco Orabona. Stability and hypothesis transfer learning. In International Conference on Machine Learning, June 2013. [ bib | .pdf | Abstract ]
[1523] Ilja Kuzborskij, Francesco Orabona, and Barbara Caputo. From n to n+1: Multiclass transfer incremental learning. In Proceedings of the Conference on Computer Vision and Pattern Recognition, June 2013. [ bib | .pdf | Abstract ]
[1524] Jukka Komulainen, Abdenour Hadid, Matti Pietikainen, André Anjos, and Sébastien Marcel. Complementary countermeasures for detecting scenic face spoofing attacks. In International Conference on Biometrics, June 2013. [ bib | http | .pdf | Abstract ]
[1525] Elie Khoury, Bostjan Vesnicer, Javier Franco-Pedroso, Ricardo Violato, Zenelabidine Boulkenafet, Luis-Miguel Mazaira Fernandez, Mireia Diez, Justina Kosmala, Houssemeddine Khemiri, Tomas Cipr, Rahim Saedi, Manuel Günther, Jerneja Zganec-Gros, Ruben Zazo Candil, Flávio Simões, Messaoud Bengherabi, Augustin Alvarez Marquina, Mikel Penagarikano, Alberto Abad, Mehdi Boulayemen, Petr Schwarz, David Van Leeuwen, Javier Gonzalez-Dominguez, Mário Uliani Neto, Elhocine Boutellaa, Pedro Gomez Vilda, Amparo Varona, Dijana Petrovska-Delacretaz, Pavel Matejka, Joaquin Gonzalez-Rodriguez, Tiago de Freitas Pereira, Farid Harizi, Luis Javier Rodriguez-Fuentes, Laurent El Shafey, Marcus Angeloni, German Bordel, Gérard Chollet, and Sébastien Marcel. The 2013 speaker recognition evaluation in mobile environment. In The 6th IAPR International Conference on Biometrics, June 2013. [ bib | .pdf ]
[1526] Manuel Günther, Artur Costa-Pazo, Changxing Ding, Elhocine Boutellaa, Giovani Chiachia, Honglei Zhang, Marcus de Assis Angeloni, Vitomir Struc, Elie Khoury, Esteban Vazquez-Fernandez, Dacheng Tao, Messaoud Bengherabi, David Cox, Serkan Kiranyaz, Tiago de Freitas Pereira, Jerneja Zganec-Gros, Enrique Argones-Rúa, Nicolas Pinto, Moncef Gabbouj, Flávio Simões, Simon Dobrisek, Daniel González-Jiménez, Anderson Rocha, Mário Uliani Neto, Nikola Pavesic, Alexandre Falcão, Ricardo Violato, and Sébastien Marcel. The 2013 face recognition evaluation in mobile environment. In The 6th IAPR International Conference on Biometrics, June 2013. [ bib | .pdf ]
[1527] Tiago de Freitas Pereira, André Anjos, José Mario De Martino, and Sébastien Marcel. Can face anti-spoofing countermeasures work in a real world scenario? In International Conference on Biometrics, June 2013. [ bib | http | .pdf | Abstract ]
[1528] Ivana Chingovska, Jinwei Yang, Zhen Lei, Dong Yi, Stan Z. Li, Olga Kähm, Naser Damer, Christian Glaser, Arjan Kuijper, Alexander Nouak, Jukka Komulainen, Tiago de Freitas Pereira, Shubham Gupta, Shubham Bansal, Shubham Khandelwal, Ayush Rai, Tarun Krishna, Dushyant Goyal, Muhammad-Adeel Waris, Honglei Zhang, Iftikhar Ahmad, Serkan Kiranyaz, Moncef Gabbouj, Roberto Tronci, Maurizio Pili, Nicola Sirena, Fabio Roli, Javier Galbally, Julian Fierrez, Allan Pinto, Helio Pedrini, William Robson Schwartz, Anderson Rocha, André Anjos, and Sébastien Marcel. The 2nd competition on counter measures to 2d face spoofing attacks. In International Conference of Biometrics 2013, June 2013. [ bib | .pdf | Abstract ]
[1529] Ivana Chingovska, André Anjos, and Sébastien Marcel. Anti-spoofing in action: joint operation with a verification system. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, Workshop on Biometrics, June 2013. [ bib | .pdf | Abstract ]
[1530] Mohammad J. Taghizadeh, Reza Parhizkar, Philip N. Garner, and Hervé Bourlard. Euclidean distance matrix completion for ad-hoc microphone array calibration. In Proceedings IEEE International Conference On Digital Signal Processing, July 2013. [ bib | .pdf | Abstract ]
[1531] Eric Malmi, Trinh-Minh-Tri Do, and Daniel Gatica-Perez. From foursquare to my square: Learning check-in behavior from multiple sources. In The 7th International AAAI Conference on Weblogs and Social Media, July 2013. [ bib | .pdf ]
[1532] Najeh Hajlaoui. Are act's scores increasing with better translation quality? In Are ACT's scores increasing with better translation quality?, page 6, July 2013. [ bib | .pdf | Abstract ]
[1533] Laurent El Shafey, Chris McCool, Roy Wallace, and Sébastien Marcel. A scalable formulation of probabilistic linear discriminant analysis: Applied to face recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(7):1788–1794, July 2013. [ bib | DOI | Abstract ]
[1534] Rahim Saedi, Kong Aik Lee, Tomi Kinnunen, Tawfik Hasan, Benoit Fauve, Pierre-Michel Bousquet, Elie Khoury, Pablo Luis Sordo Martinez, Jia Min Karen Kua, Changhuai You, Hanwu Sun, Anthony Larcher, Padmanabhan Rajan, Ville Hautamäki, Cemal Hanilci, Billy Braithwaite, Gonzalez-Hautamäki Rosa, Seyed Omid Sadjadi, Gang Liu, Hynek Boril, Navid Shokouhi, Driss Matrouf, Laurent El Shafey, Pejman Mowlaee, Julien Epps, Tharmarajah Thiruvaran, David Van Leeuwen, Bin Ma, Haizhou Li, Jean-François Bonastre, Sébastien Marcel, John Mason, and Eliathamby Ambikairajah. I4u submission to nist sre 2012: a large-scale collaborative effort for noise-robust speaker verification. In INTERSPEECH, August 2013. [ bib | Abstract ]
[1535] Mickael Rouvier, Gregor Dupuy, Paul Gay, Elie Khoury, Teva Merlin, and Sylvain Meignier. A free state-of-the-art toolbox for broadcast news diarization. In INTERSPEECH, August 2013. [ bib ]
[1536] Milos Cernak, Xingyu Na, and Philip N. Garner. Syllable-based pitch encoding for low bit rate speech coding with recognition/synthesis architecture. In Proc. of Interspeech 2013, August 2013. [ bib | .pdf ]
[1537] A. Sapru and Hervé Bourlard. Investigating the impact of language style and vocal expression on social roles of participants in professional meetings. In Affective Computing and Intelligent Interaction, page 6, September 2013. [ bib | .pdf ]
[1538] Kenneth Alberto Funes Mora and Jean-Marc Odobez. Person independent 3d gaze estimation from remote rgb-d cameras. In International Conference on Image Processing. IEEE, September 2013. [ bib | Abstract ]
