Pattern recognition aims to automate the perceptual and cognitive capabilities of animals, i.e., to perceive and interpret the environment on the basis of visual, acoustic, haptic, olfactory, gustatory, or any other sensors. However, only the cognitive abilities are meant to be imitated; the processing strategy may differ completely from its biological counterparts. To focus our research, we concentrate our activities in this field on the learning, classification, and interpretation of visual and acoustic input in the context of human-robot interaction. Examples are the adaptation of kernel particle filtering for tracking the human body pose, the extension of the anchoring approach towards a multimodal anchoring framework, the development of a structural framework for assembly modeling and recognition, and approaches to imitation learning for behavior acquisition in artificial agents. Although we mainly use statistical methods, we are also open to structural approaches.