Sebastian Hofstätter

Information Retrieval • Dense Retrieval • Neural Re-Ranking

Hi, I am Sebastian, a PhD student and research & teaching assistant at TU Wien, supervised by Prof. Allan Hanbury. I work in the field of Information Retrieval on efficient & interpretable neural re-ranking and effective dense retrieval models.

My goal is to make neural techniques in IR accessible to a large audience. To that end, I study and optimize the cost-effectiveness tradeoff from multiple angles, so that everyone can deploy these techniques. In addition, I created an award-winning master-level course to teach neural advances in IR – all of which is open-source 🎉

Find me on Twitter, GitHub, HuggingFace,
or email me: sebastian.hofstaetter@tuwien.ac.at

Research

In my PhD, I work on the topic of "Optimizing for the Cost-Effectiveness Tradeoff in Neural Retrieval and Re-Ranking" from multiple angles, split into two main parts:

Neural Retrieval & Knowledge Distillation

Dense retrieval (using nearest neighbor vector search) has quickly gained popularity as a promising future of search. I focus on knowledge distillation from stronger, but slower, teacher models to improve dense retrieval quality.
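The core distillation idea can be sketched as a margin-based loss: the dense student learns to reproduce the teacher's score *margin* between a relevant and a non-relevant passage, rather than its absolute scores. This is a minimal illustration under that assumption, not actual training code; `margin_mse_loss` is a name introduced here for the sketch:

```python
def margin_mse_loss(student_pos, student_neg, teacher_pos, teacher_neg):
    """Margin-MSE distillation sketch: for each query, compare the student's
    margin (positive score minus negative score) against the teacher's margin
    and average the squared differences over the batch."""
    squared_errors = [
        ((sp - sn) - (tp - tn)) ** 2
        for sp, sn, tp, tn in zip(student_pos, student_neg, teacher_pos, teacher_neg)
    ]
    return sum(squared_errors) / len(squared_errors)

# Toy batch: one query, with a positive and a negative passage scored by the
# dense student and by a stronger (e.g. cross-encoder) teacher.
loss = margin_mse_loss([2.0], [1.0], [3.0], [1.0])
# student margin = 1.0, teacher margin = 2.0, so the loss is (1.0 - 2.0)^2 = 1.0
```

Distilling margins instead of raw scores lets architecturally different models act as teachers, since only the relative ordering signal has to transfer.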

2021 SIGIR (Full)
Efficiently Teaching an Effective Dense Retriever with Balanced Topic Aware Sampling
S. Hofstätter, S.-C. Lin, J.-H. Yang, J. Lin, A. Hanbury

2020 arXiv (Full)
Improving Efficient Neural Ranking Models with Cross-Architecture Knowledge Distillation
S. Hofstätter, S. Althammer, M. Schröder, M. Sertkan, A. Hanbury

2019 ECIR (Short)
Enriching Word Embeddings for Patent Retrieval with Global Context
S. Hofstätter, N. Rekabsaz, M. Lupu, C. Eickhoff, A. Hanbury
⭐ Won best systems short paper award


Efficient & Interpretable Neural Re-Ranking

Neural re-ranking models always add to the query latency, so I focus on improving the efficiency of short- and long-text neural re-ranking models.
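To illustrate why efficiency matters here: in a typical pipeline only the top-k first-stage candidates are re-scored by the slow neural model, so k directly caps the added inference cost per query. A minimal sketch, where `score_fn` stands in for any hypothetical neural scorer:

```python
def rerank_top_k(query, candidates, score_fn, k=10):
    """Re-score only the top-k first-stage candidates with the neural model;
    documents beyond k keep their first-stage order. Added latency grows with k,
    since score_fn is invoked exactly k times."""
    head = sorted(candidates[:k], key=lambda doc: score_fn(query, doc), reverse=True)
    return head + candidates[k:]

# Toy example: first-stage (e.g. BM25) order is [d1, d2, d3]; a stand-in
# "neural" scorer prefers d2, so re-ranking the top 2 swaps d1 and d2.
ranking = rerank_top_k(
    "q", ["d1", "d2", "d3"],
    score_fn=lambda q, d: {"d1": 0.2, "d2": 0.9, "d3": 0.5}[d],
    k=2,
)
# ranking == ["d2", "d1", "d3"]
```

Shrinking the per-document cost of `score_fn` (or selecting passages more cleverly within long documents) is what lets k stay large without blowing up the latency budget.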

2021 SIGIR (Full)
Intra-Document Cascading: Learning to Select Passages for Neural Document Ranking
S. Hofstätter, B. Mitra, H. Zamani, N. Craswell, A. Hanbury

2021 ECIR (Full)
Mitigating the Position Bias of Transformer Models in Passage Re-Ranking
S. Hofstätter, A. Lipani, S. Althammer, M. Zlabinger, A. Hanbury

2020 TREC
Evaluating Transformer-Kernel Models at TREC Deep Learning 2020
S. Hofstätter, A. Hanbury

2020 SIGIR (Short)
Local Self-Attention over Long Text for Efficient Document Retrieval
S. Hofstätter, B. Mitra, H. Zamani, N. Craswell, A. Hanbury

2020 CIKM (Short)
Learning to Re-Rank with Contextualized Stopwords
S. Hofstätter, A. Lipani, M. Zlabinger, A. Hanbury

2020 CIKM (Resource)
Fine-Grained Relevance Annotations for Multi-Task Document Ranking and Question Answering
S. Hofstätter, M. Zlabinger, M. Sertkan, M. Schröder, A. Hanbury

2020 ECAI (Full)
Interpretable & Time-Budget-Constrained Contextualization for Re-Ranking
S. Hofstätter, M. Zlabinger, A. Hanbury

2020 ECIR (Demo)
Neural-IR-Explorer: A Content-Focused Tool to Explore Neural Re-Ranking Results
S. Hofstätter, M. Zlabinger, A. Hanbury

2019 TREC
TU Wien @ TREC Deep Learning '19 – Simple Contextualization for Re-ranking
S. Hofstätter, M. Zlabinger, A. Hanbury

2019 SIGIR (Short)
On the Effect of Low-Frequency Terms on Neural-IR Models
S. Hofstätter, N. Rekabsaz, C. Eickhoff, A. Hanbury

2019 OSIRRC (Workshop)
Let's measure runtime!
S. Hofstätter, A. Hanbury

For all my publications (including collaborations), visit Google Scholar or Semantic Scholar.

Teaching

All my materials are available open-source on GitHub. For a detailed description of our workflow for remote teaching see:

2022 SIGCSE (Full)
A Time-Optimized Content Creation Workflow for Remote Teaching
S. Hofstätter, S. Althammer, M. Sertkan, A. Hanbury

Advanced Information Retrieval (Summer 2021, 2020, 2019)

Full responsibility as main lecturer for the master-level course with more than 100 students on neural methods for IR, including designing, conducting, and grading the lectures, exercises, and exams.
A playlist of all lecture recordings from 2021 is available on YouTube.

Lectures: Basics of IR; Sequence modelling; Neural retrieval & re-ranking
(10 lectures total)
Exercise: Implement neural re-ranking models in PyTorch

🏆 Won Best Distance Learning Lecture & Best Teacher Award 2021 @ TU Wien


Introduction to Information Retrieval (Winter 2019, 2018)

Lectures: Inverted index, scoring models, efficient & fast text processing
Exercise: Implement an efficient search engine from scratch

Education

2018 - present PhD - Computer Science - TU Wien
2016 - 2018 Master's - Software Engineering - TU Wien
2012 - 2016 Bachelor's - Computer Science and Economics - TU Wien