Work

Overview

This section covers my academic output (publications, non-archival papers, and presentations and posters), media interviews, industry employment history, teaching experience, and other projects. For my full CV in PDF format, click here.


Publications

  • Michaelov, J. A., Levy, R. P., & Bergen, B. K. (2025). Language Model Behavioral Phases are Consistent Across Architecture, Training Data, and Scale. Advances in Neural Information Processing Systems 38 (NeurIPS 2025). [Open-Access Paper]

  • Michaelov, J. A., Estacio, R., Zhang, Z., & Bergen, B. K. (2025). Not quite Sherlock Holmes: Language model predictions do not reliably differentiate impossible from improbable events. Findings of the Association for Computational Linguistics: ACL 2025, 13528-13551. [Open-Access Paper]

  • Arnett, C., Chang, T. A., Michaelov, J. A., & Bergen, B. K. (2025). On the Acquisition of Shared Grammatical Representations in Bilingual Language Models. Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (ACL 2025), 20707-20726. [Open-Access Paper]

  • Michaelov, J. A., Arnett, C., & Bergen, B. K. (2024). Revenge of the Fallen? Recurrent Models Match Transformers at Predicting Human Language Comprehension Metrics. First Conference on Language Modeling (COLM 2024). [Open-Access Paper]

  • Michaelov, J. A., & Bergen, B. K. (2024). On the Mathematical Relationship Between Contextual Probability and N400 Amplitude. Open Mind, 8, 859-897. [Open-Access Paper]

  • Michaelov, J. A., Bardolph, M. D., Van Petten, C. K., Bergen, B. K., & Coulson, S. (2024). Strong Prediction: Language Model Surprisal Explains Multiple N400 Effects. Neurobiology of Language, 5(1), 107-135. [Open-Access Paper]

  • Michaelov, J. A., Arnett, C., Chang, T. A., & Bergen, B. K. (2023). Structural priming demonstrates abstract grammatical representations in multilingual language models. Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP 2023), 3703-3720. [Open-Access Paper]

  • Michaelov, J. A., & Bergen, B. K. (2023). Emergent inabilities? Inverse scaling over the course of pretraining. Findings of the Association for Computational Linguistics: EMNLP 2023, 14607-14615. [Open-Access Paper]

  • Michaelov, J. A., & Bergen, B. K. (2023). Ignoring the alternatives: The N400 is sensitive to stimulus preactivation alone. Cortex, 168, 82-101. [Open-Access Paper]

  • Michaelov, J. A., & Bergen, B. K. (2023). Rarely a problem? Language models exhibit inverse scaling in their predictions following few-type quantifiers. Findings of the Association for Computational Linguistics: ACL 2023, 14162-14174. [Open-Access Paper]

  • Rezaii, N., Michaelov, J., Josephy-Hernandez, S., Ren, B., Hochberg, D., Quimby, M., & Dickerson, B. C. (2023). Measuring Sentence Information via Surprisal: Theoretical and Clinical Implications in Nonfluent Aphasia. Annals of Neurology, 94, 647-657. [Open-Access Paper]

  • Trott, S., Jones, C., Chang, T., Michaelov, J., & Bergen, B. (2023). Do Large Language Models know what humans know? Cognitive Science, 47(7), e13309. [Open-Access Paper]

  • Michaelov, J. A., & Bergen, B. K. (2022). Collateral facilitation in humans and language models. Proceedings of the 26th Conference on Computational Natural Language Learning (CoNLL 2022), 13-26. [Open-Access Paper]

  • Michaelov, J. A., & Bergen, B. K. (2022). Do language models make human-like predictions about the coreferents of Italian anaphoric zero pronouns? Proceedings of the 29th International Conference on Computational Linguistics (COLING 2022), 1-14. [Open-Access Paper]

  • Michaelov, J. A., Coulson, S., & Bergen, B. K. (2022). So Cloze yet so Far: N400 Amplitude is Better Predicted by Distributional Information than Human Predictability Judgements. IEEE Transactions on Cognitive and Developmental Systems, 15(3), 1033-1042. [Paper] [arXiv]

  • Michaelov, J. A., & Bergen, B. K. (2020). How well does surprisal explain N400 amplitude under different experimental conditions? Proceedings of the 24th Conference on Computational Natural Language Learning (CoNLL 2020), 652-663. [Open-Access Paper]

  • Michaelov, J. (2017). The Young and the Old: (T) Release in Elderspeak. Lifespans and Styles, 3(1), 2-9. [Open-Access Paper]

Refereed conference papers (non-archival)

  • Michaelov, J. A., & Arnett, C. (2025). Disaggregation Reveals Hidden Training Dynamics: The Case of Agreement Attraction. To be presented at the First Workshop on Interpreting Cognition in Deep Learning Models (CogInterp 2025). [arXiv]

  • Michaelov, J. A., Coulson, S., & Bergen, B. K. (2023). Can Peanuts Fall in Love with Distributional Semantics? Proceedings of the Annual Meeting of the Cognitive Science Society, 45. Sydney, Australia. [Open-Access Paper]

  • Michaelov, J. A., & Bergen, B. K. (2022). The more human-like the language model, the more surprisal is the best predictor of N400 amplitude. The NeurIPS 2022 Workshop on Information-Theoretic Principles in Cognitive Systems (InfoCog). New Orleans, USA. [Open-Access Paper]

  • Jones, C. R., Chang, T. A., Coulson, S., Michaelov, J. A., Trott, S., & Bergen, B. (2022). Distributional Semantics Still Can’t Account for Affordances. In Proceedings of the Annual Meeting of the Cognitive Science Society, 44. Toronto, Canada. [Open-Access Paper]

  • Michaelov, J. A., Bardolph, M. D., Coulson, S., & Bergen, B. K. (2021). Different kinds of cognitive plausibility: why are transformers better than RNNs at predicting N400 amplitude? In Proceedings of the Annual Meeting of the Cognitive Science Society, 43. University of Vienna, Vienna, Austria (Conference held online). [Open-Access Paper]

Presentations and posters

  • Michaelov, J. A., & Levy, R. P. (2025). The effect of orthographic neighborhood density on reading time in 9 languages. Poster at The 38th Annual Conference on Human Sentence Processing (HSP 2025). University of Maryland, College Park, USA.

  • Arnett, C., Chang, T. A., Michaelov, J. A., & Bergen, B. K. (2023). Crosslingual Structural Priming and the Pre-Training Dynamics of Bilingual Language Models. Poster at The 3rd Multilingual Representation Learning Workshop (MRL 2023). [Extended Abstract]

  • Michaelov, J. A., Coulson, S., & Bergen, B. K. (2022). Do we need situation models? Distributional semantics can explain how peanuts fall in love. Spruik and Break-out session at The 35th Annual Conference on Human Sentence Processing (HSP 2022). University of California Santa Cruz, Santa Cruz, USA (Conference held online).

  • Michaelov, J. A., Coulson, S., & Bergen, B. K. (2022). Cloze behind: Language model surprisal predicts N400 amplitude better than cloze. Spruik and Break-out session at The 35th Annual Conference on Human Sentence Processing (HSP 2022). University of California Santa Cruz, Santa Cruz, USA (Conference held online). [Poster]

  • Michaelov, J. A., Bardolph, M. D., Coulson, S., & Bergen, B. K. (2021). Is the relationship between word probability and processing difficulty linear or logarithmic? Short talk presentation at The 34th CUNY Conference on Human Sentence Processing (CUNY 2021). University of Pennsylvania, Philadelphia, USA (Conference held online).

  • Michaelov, J. A., Bardolph, M. D., Coulson, S., & Bergen, B. K. (2020). Surprisal is a good predictor of the N400 effect, but not for semantic relations. Oral presentation at The 26th Architectures and Mechanisms for Language Processing Conference (AMLaP 2020) as part of the Special Session: Computational models of language processing. University of Potsdam, Potsdam, Germany (Conference held online). [Abstract]

  • Michaelov, J., Culbertson, J., & Rohde, H. (2017). How universal are prominence hierarchies? Evidence from native English speakers. Poster at The 23rd Architectures and Mechanisms for Language Processing Conference (AMLaP 2017). Lancaster, UK. [Abstract] [Poster]

  • Michaelov, J. (2017). The Young and the Old: (T) Release in Elderspeak. Oral presentation at the Undergraduate Linguistics Association of Britain 2017 Conference (ULAB 2017), The University of Cambridge, Cambridge, UK, September 4. [Abstract]


Media Interviews

Sandrine Ceurstemont (2023). Bigger, Not Necessarily Better. Communications of the ACM. (Interviewed and quoted in article). [Link to article]


Employment

Amazon

  • Applied Scientist Intern at Alexa Games (Summer 2023)


Teaching Experience

The University of California San Diego

  • TA: Introduction to Python (Spring 2024)
  • TA: Data Science in Practice (Winter 2024)
  • TA: Introduction to Data Science (Fall 2023)
  • TA: Learning, Memory, and Attention (Spring 2023)
  • TA: Neurobiology of Cognition (Winter 2023)
  • TA: Cognitive Consequences of Technology (Fall 2022)
  • TA: Cognitive Perspectives (Summer 2022)
  • TA: What the *#!?: An Uncensored Introduction to Language (Fall 2021)
  • TA: Cognitive Neuroeconomics (Fall 2020)
  • TA: Cognitive Neuroeconomics (Summer 2020)
  • TA: Language Comprehension (Summer 2020)
  • TA: Cognitive Neuroeconomics (Winter 2020)
  • TA: What the *#!?: An Uncensored Introduction to Language (Fall 2019)
  • TA: Minds and Brains (Spring 2019)

The University of Edinburgh

  • Tutor: Logic 1 (2018)
  • Tutor: Informatics 1: Computation and Logic (2017)


Other Projects