We describe our experiments for the Similar Segments in Social Speech Task at MediaEval 2013 Benchmark. We mainly focus on segmentation of the recordings into shorter passages on which we apply standard retrieval techniques.
We experiment with machine-learning-based segmentation employing textual (word n-grams, tag n-grams, letter cases, lexical cohesion, etc.) and prosodic features (silence) and compare the results with those obtained by regular segmentation.