URL: https://www.mohammed-attia.com/

The Arabic morphological transducer and finite state tools, such as tokenizer
and guesser, are developed using finite state technology.

 Mohammed A. Attia

NLP Research and Resources

  Welcome to my website.




Mohammed Attia




Analytical Linguist, Nov 2014
Google Inc.

Research Scientist, Oct 2013 - Nov 2014
George Washington University.

Lexicographer, Oct 2012 - Oct 2013
Oxford University Press.

Research Fellow, Feb - Oct 2012
The British University in Dubai,
United Arab Emirates.

Post-doctoral Researcher, Sept 2009 - Dec 2011,
School of Computing, Dublin City University,
Dublin, Ireland.

Lecturer in Linguistics, Sept 2008 - Aug 2009,
Al-Azhar University, Cairo, Egypt.

Ph.D. in Computational Linguistics, Apr 2004 - May 2008,
School of Languages, Linguistics and Cultures,
The University of Manchester, UK.

Translator and Web Developer, Jan 1996 - Apr 2004
Harf Information Technology,
Egypt.



Research outcomes:

1. AraComLex Open-Source Morphological Analyser for Modern Standard Arabic
[link]
I developed AraComLex (Arabic Computer Lexicon), an open-source, large-scale
finite state morphological transducer for processing Arabic texts, containing
more than 30,000 lemmas. Its competitive edge over Buckwalter's morphology is
that it tries to specialize purely in MSA, avoiding the noise coming from
Classical Arabic and the incorrect word-clitic formations that are rampant in
Buckwalter's morphology. My morphology is compatible with the open-source
finite state compiler Foma. All you need to do is download Foma, download
AraComLex from Sourceforge.net and read the README file to learn how to
compile. You can compile the transducer under Windows, Linux or Mac OS X.
Reference:
Mohammed Attia, Pavel Pecina, Antonio Toral, Josef van Genabith. 2013. A
Corpus-Based Finite-State Morphological Toolkit for Contemporary Arabic. Journal
of Logic and Computation 2013; doi: 10.1093/logcom/exs070. Oxford University
Press. [pdf version]


2. Arabic Morphology Patterns [link]
I developed a database of 490 templatic patterns for Arabic (الأوزان الصرفية في
اللغة العربية) that has been successfully used in detecting unknown words in a
statistical parser and in lexical profiling tasks. [Download from
Sourceforge.net]
Reference:
Mohammed Attia, Pavel Pecina, Lamia Tounsi, Antonio Toral, Josef van Genabith.
2011. Lexical Profiling for Arabic. Electronic Lexicography in the 21st Century.
Bled, Slovenia. [pdf version]


3. Arabic Subcategorization Frames in the LFG Parser [link]
I manually developed a list of subcategorization frames to be used in the Arabic
LFG parser, containing 2901 lemma-frame types. [Download from Sourceforge.net]
Reference:
Mohammed Attia. (2008) 'Handling Arabic Morphological and Syntactic Ambiguity
within the LFG Framework with a View to Machine Translation'. PhD Thesis. School
of Languages, Linguistics and Cultures, the University of Manchester. [pdf
version]


4. Arabic Subcategorization Frames in the Arabic Treebank [link]
I automatically extracted the list of subcategorization frames (following the
LFG syntactic theory) from the Arabic Treebank, containing 7746 lemma-frame
types for verbs, nouns and adjectives. [Download from Sourceforge.net]
Reference:
Mohammed Attia, Khaled Shaalan, Lamia Tounsi, and Josef van Genabith. 2012.
Automatic Extraction and Evaluation of Arabic LFG Resources. Language Resources
and Evaluation (LREC). Istanbul, Turkey. Pages 1947-1954. [pdf version]


5. Arabic Wordlist for Spellchecking [link]
I developed the Arabic word list for spell checking containing 9 million Arabic
words. The words are automatically generated from the AraComLex open-source
finite state transducer and from a one billion word corpus. The entire list is
validated against the Microsoft Word spell checker. [Download from
Sourceforge.net]
Reference:
Attia, Mohammed, Pavel Pecina, Younes Samih, Khaled Shaalan, Josef van Genabith.
2012. Improved Spelling Error Detection and Correction for Arabic. COLING 2012,
Mumbai, India. [pdf version]


6. Named Entities and Multiword Expressions
I developed the largest lexical database for named entities and multiword
expressions to date, using automatic methods to process a large corpus of over
one billion words. The multiword expression resource for Arabic totals 34,658
MWEs (Download from Sourceforge.net). The Arabic named entity lexicon contains
45,202 entries (Download from Sourceforge.net).
References:
Mohammed Attia, Antonio Toral, Lamia Tounsi, Monica Monachini and Josef van
Genabith. 2010. 'An automatically built Named Entity lexicon for Arabic'. LREC
2010. Valletta, Malta. [pdf version]
Mohammed Attia, Antonio Toral, Lamia Tounsi, Pavel Pecina and Josef van
Genabith. 2010. Automatic Extraction of Arabic Multiword Expressions. COLING
2010 Workshop on Multiword Expressions: from Theory to Applications. Beijing,
China. [pdf version]


7.a. Word Count of Modern Standard Arabic [link]
I developed a word count of Modern Standard Arabic from a 1 billion word corpus,
sorted according to frequency counts. [Download from Sourceforge.net]
Reference:
Mohammed Attia, Pavel Pecina, Lamia Tounsi, Antonio Toral, Josef van Genabith.
2011. A Lexical Database for Modern Standard Arabic Interoperable with a Finite
State Morphological Transducer. In Mahlow, Cerstin; Piotrowski, Michael (Eds.)
Systems and Frameworks for Computational Morphology. Second International
Workshop, SFCM 2011, Zurich, Switzerland, August 26, 2011, Proceedings. Series:
Communications in Computer and Information Science, Vol. 100. 1st Edition. [pdf
version]
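
As a rough illustration only (not the actual pipeline behind this resource), a
frequency-sorted word count can be sketched in Python as below. The whitespace
tokenization, the function name word_counts and the three sample sentences are
assumptions made for the example.

```python
from collections import Counter


def word_counts(lines):
    """Count whitespace-separated tokens and sort them by descending frequency."""
    counts = Counter()
    for line in lines:
        # Assumption: tokens are separated by whitespace only.
        counts.update(line.split())
    # Sort by frequency (high to low); break ties by code point order.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical toy corpus; a real corpus would be streamed from files.
sample = ["كتب الولد الدرس", "قرأ الولد الكتاب", "كتب المعلم الدرس"]
ranked = word_counts(sample)

for word, freq in ranked:
    print(freq, word)
```

For a corpus of a billion words the same idea would need streaming input and
probably on-disk aggregation; the sketch only shows the counting and sorting
step.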


7.b. Word Count of Modern Standard Arabic - with diversity and full forms [link]
I developed a word count of Modern Standard Arabic from a large and diverse
collection of corpora, sorted according to a combination of diversity and
frequency counts. [Download from Sourceforge.net]
Reference:
Mohammed Attia, Pavel Pecina, Lamia Tounsi, Antonio Toral, Josef van Genabith.
2011. A Lexical Database for Modern Standard Arabic Interoperable with a Finite
State Morphological Transducer. In Mahlow, Cerstin; Piotrowski, Michael (Eds.)
Systems and Frameworks for Computational Morphology. Second International
Workshop, SFCM 2011, Zurich, Switzerland, August 26, 2011, Proceedings. Series:
Communications in Computer and Information Science, Vol. 100. 1st Edition. [pdf
version]


8. Arabic Broken Plurals [link]
A list of Arabic Broken Plurals automatically extracted from a large
contemporary corpus, provided with morphological patterns for both the singular
forms and the plural forms. It contains 2562 broken plural forms. [Download from
Sourceforge.net]
Reference:
Mohammed Attia, Pavel Pecina, Lamia Tounsi, Antonio Toral, Josef van Genabith.
2011. Lexical Profiling for Arabic. Electronic Lexicography in the 21st Century.
Bled, Slovenia. [pdf version]


9. Arabic Unknown Words - Weighted [link]
This is a list of unknown words, or words that are not included in the
Buckwalter Morphological Analyser version 2.0. It includes about 18,000 new
lemmatized words, and they are weighted and ordered so that there is a good
likelihood that words which are most relevant (lexicographically) will surface
to the top and the least relevant words will be pushed down the list. [Download
from Sourceforge.net]
Reference:
Attia, Mohammed, Younes Samih, Khaled Shaalan, Josef van Genabith. 2012. The
Floating Arabic Dictionary: An Automatic Method for Updating a Lexical Database.
COLING 2012, Mumbai, India. [pdf version]


10. Obsolete Arabic Words [link]
This is a list of obsolete words, or words that are outdated or not in
contemporary use, in the Buckwalter Morphological Analyser database. This list
was developed according to a frequency threshold on the web and in the Arabic
Gigaword corpus. The list contains about 8,400 words that fell out of current
use, with a margin of error of 1%. [Download from Sourceforge.net]
Reference:
Mohammed Attia, Pavel Pecina, Lamia Tounsi, Antonio Toral, Josef van Genabith.
2011. A Lexical Database for Modern Standard Arabic Interoperable with a Finite
State Morphological Transducer. In Mahlow, Cerstin; Piotrowski, Michael (Eds.)
Systems and Frameworks for Computational Morphology. Second International
Workshop, SFCM 2011, Zurich, Switzerland, August 26, 2011, Proceedings. Series:
Communications in Computer and Information Science, Vol. 100. 1st Edition. [pdf
version]
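
The frequency-threshold idea can be sketched roughly as follows. This is a
minimal illustration, not the procedure actually used for the list: the
function name find_obsolete, the threshold values, the rule that a word must
fall below both thresholds, and the placeholder entries are all assumptions.

```python
def find_obsolete(lexicon, web_freq, corpus_freq, web_min=5, corpus_min=1):
    """Return lexicon entries whose frequency is below both thresholds.

    web_min and corpus_min are hypothetical cut-offs chosen for illustration.
    """
    obsolete = []
    for word in lexicon:
        # Words missing from a frequency table are treated as frequency 0.
        rare_on_web = web_freq.get(word, 0) < web_min
        rare_in_corpus = corpus_freq.get(word, 0) < corpus_min
        if rare_on_web and rare_in_corpus:
            obsolete.append(word)
    return obsolete


# Placeholder entries standing in for lexicon lemmas.
lexicon = ["word_a", "word_b", "word_c"]
web_freq = {"word_a": 1200, "word_b": 2, "word_c": 3}
corpus_freq = {"word_a": 450, "word_b": 0, "word_c": 7}

result = find_obsolete(lexicon, web_freq, corpus_freq)
print(result)
```

Requiring a word to be rare in both sources is the conservative choice here;
flagging words that are rare in either source would yield a longer list.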


11. AraComLex Lexical Web Application for Modern Standard Arabic
I developed a web application (dictionary writing system) for curating a
large-scale, corpus-driven lexical database for Modern Standard Arabic,
containing 30,000 lemmas and following modern lexicographic practices.
Reference:
Mohammed Attia, Pavel Pecina, Lamia Tounsi, Antonio Toral, Josef van Genabith.
2011. A Lexical Database for Modern Standard Arabic Interoperable with a Finite
State Morphological Transducer. In Mahlow, Cerstin; Piotrowski, Michael (Eds.)
Systems and Frameworks for Computational Morphology. Second International
Workshop, SFCM 2011, Zurich, Switzerland, August 26, 2011, Proceedings. Series:
Communications in Computer and Information Science, Vol. 100. 1st Edition. [pdf
version]


12. Tharwa Lexical Web Application for Colloquial Arabic
I developed a web application for curating a large-scale lexical database for
Colloquial Arabic for GWU. [View here]
Reference:
Mona Diab, Mohamed Al-Badrashiny, Maryam Aminian, Mohammed Attia, Pradeep
Dasigi, Heba Elfardy, Ramy Eskander, Nizar Habash, Abdelati Hawwari, Wael
Salloum. (2014) Tharwa: A Large Scale Dialectal Arabic - Standard Arabic -
English Lexicon. The 9th edition of the Language Resources and Evaluation (LREC)
Conference, 26-31 May, Reykjavik, Iceland. [pdf version]


13. Arabic LFG Rule-based Parser for Modern Standard Arabic
I developed the first Arabic rule-based parser to be freely available on the
internet for Modern Standard Arabic, using XLE. The output of this parser is
a phrase structure tree (c-structure) and a dependency structure (f-structure).
The parser is hosted by Bergen University in Norway, along with parsers for
English, German, Malagasy, Norwegian and Welsh. Test the parser here.
Reference:
Mohammed Attia. (2008) 'Handling Arabic Morphological and Syntactic Ambiguity
within the LFG Framework with a View to Machine Translation'. PhD Thesis. School
of Languages, Linguistics and Cultures, the University of Manchester. [pdf
version]


Ph.D. thesis:
Title: Handling Arabic Morphological and Syntactic Ambiguity within the LFG
Framework with a View to Machine Translation.
Description: This research investigates different methodologies to manage the
problem of morphological and syntactic ambiguities in Arabic. I built an Arabic
parser using XLE (Xerox Linguistic Environment), which allows writing grammar
rules and notations that follow the LFG formalism. I also formulated a
description of the main syntactic structures in Arabic within the LFG framework.
Mohammed Attia. (2008) 'Handling Arabic Morphological and Syntactic Ambiguity
within the LFG Framework with a View to Machine Translation'. PhD Thesis. School
of Languages, Linguistics and Cultures, the University of Manchester. [pdf
version]

Publications

Dictionaries:

 * Tressy Arts, Radia Benzehra, Mohammed Attia, et al. 2014. Oxford Arabic
   Dictionary. Oxford University Press, ISBN 978-0-19-958033-0. August 2014
   (estimated)

Books:

 * Mohammed Attia, Jason Marino, Dorothy Bayern, Brenna Saunders. 2019. Live and
   Learn: Current English Proverbs, Explained. Kindle Edition. [link]
 * Mohammed Attia. 2012. Ambiguity In Arabic Computational Morphology And
   Syntax: A Study within the Lexical Functional Grammar Framework. LAP Lambert
   Academic Publishing, ISBN 978-3-8484-4967-5

Book Chapters:

 1. Mohammed Attia, Pavel Pecina, Lamia Tounsi, Antonio Toral, Josef van
    Genabith. 2011. A Lexical Database for Modern Standard Arabic Interoperable
    with a Finite State Morphological Transducer. In Mahlow, Cerstin;
    Piotrowski, Michael (Eds.) Systems and Frameworks for Computational
    Morphology. Second International Workshop, SFCM 2011, Zurich, Switzerland,
    August 26, 2011, Proceedings. Series: Communications in Computer and
    Information Science, Vol. 100. 1st Edition. [pdf version]
 2. Mohammed Attia. (2006) 'Accommodating Multiword Expressions in an Arabic LFG
    Grammar'. In T. Salakoski et al. (Eds.): Advances in Natural Language
    Processing. FinTAL 2006, Lecture Notes in Computer Science. Vol. 4139, pp.
    87 - 98, 2006. Springer-Verlag Berlin Heidelberg 2006. [pdf version]

Journal Papers:

 * Attia, Mohammed, Pavel Pecina, Younes Samih, Khaled Shaalan, Josef van
   Genabith. 2015. Arabic Spelling Error Detection and Correction. Journal of
   Natural Language Engineering, Cambridge University Press. [URL]
 * Mohammed Attia, Pavel Pecina, Antonio Toral, Josef van Genabith. 2013. A
   Corpus-Based Finite-State Morphological Toolkit for Contemporary Arabic.
   Journal of Logic and Computation 2013; doi: 10.1093/logcom/exs070. Oxford
   University Press. [pdf version]

Theses:

 * Mohammed Attia. 2008. Handling Arabic morphological and syntactic ambiguity
   within the LFG framework with a view to machine translation. Ph.D. Thesis.
   School of Languages, Linguistics and Cultures, the University of Manchester,
   UK. [pdf version]
 * Mohammed Attia. 2002. Implications of the agreement features in machine
   translation. Master's Thesis. Faculty of Languages and Translation, Al-Azhar
   University, Cairo, Egypt. [pdf version]

Conference Papers:

 1.  Mohammed Attia, Younes Samih, Yo Ehara. 2023. Statistical Measures for
     Readability Assessment. Proceedings of the Joint 3rd International
     Conference on Natural Language Processing for Digital Humanities. [link]
 2.  Kareem Darwish, Mohammed Attia, Hamdy Mubarak, Younes Samih, Ahmed
     Abdelali, Lluís Màrquez, Mohamed Eldesouki and Laura Kallmeyer. 2020.
     Effective Multi Dialectal Arabic POS Tagging. Natural Language Engineering,
     doi:10.1017/S1351324920000078. [link]
 3.  Mohammed Attia and Ali Elkahky. 2019. Segmentation for Domain Adaptation in
     Arabic. Workshop on Arabic Natural Language Processing -- ACL 2019,
     Florence, Italy (2019). [link]
 4.  Mohammed Attia, Ahmed Abdelali, Ali Elkahky, Hamdy Mubarak, Kareem Darwish,
     and Younes Samih. 2019. POS Tagging for Improving Code-Switching
     Identification in Arabic. Workshop on Arabic Natural Language Processing --
     ACL 2019, Florence, Italy (2019). [link]
 5.  Ahmed Abdelali, Hamdy Mubarak, Kareem Darwish, Mohamed Eldesouki, Mohammed
     Attia and Younes Samih. 2019. QC-GO Submission for MADAR Shared Task:
     Arabic Fine-Grained Dialect Identification. MADAR Shared Task on Dialect
     Identification -- ACL 2019 (2019). [link]
 6.  Ahmed Abdelali, Mohammed Attia, Younes Samih, Kareem Darwish, Hamdy
     Mubarak. 2018. Diacritization of Maghrebi Arabic Sub-Dialects. arXiv
     preprint arXiv:1810.06619 (2018). [pdf version]
 7.  Kareem Darwish, Ahmed Abdelali, Hamdy Mubarak, Younes Samih, Mohammed
     Attia. 2018. Diacritization of Moroccan and Tunisian Arabic Dialects: A CRF
     Approach. The 3rd Workshop on Open-Source Arabic Corpora and Processing
     Tools in the Proceedings of the Eleventh International Conference on
     Language Resources and Evaluation (LREC 2018), European Language Resources
     Association (ELRA), Miyazaki, Japan (2018). [pdf version]
 8.  Mohammed Attia, Younes Samih, Manaal Faruqui, Wolfgang Maier. 2018. GHH at
     SemEval-2018 Task 10: Discovering Discriminative Attributes in
     Distributional Semantics. SemEval 2018 Task 10 on Capturing Discriminative
     Attributes, pages 947–952. New Orleans, Louisiana (2018). [pdf version]
 9.  Mohammed Attia, Younes Samih, Wolfgang Maier. 2018. GHHT at CALCS 2018:
     Named Entity Recognition for Dialectal Arabic Using Neural Networks. Third
     Workshop on Computational Approaches to Linguistic Code-switching in ACL
     2018, pages 98–102, Melbourne, Australia (2018). [pdf version]
 10. Kareem Darwish, Hamdy Mubarak, Ahmed Abdelali, Mohamed Eldesouki, Younes
     Samih, Randah Alharbi, Mohammed Attia, Walid Magdy, Laura Kallmeyer. 2018.
     Multi-Dialect Arabic POS Tagging: A CRF Approach. Proceedings of the
     Eleventh International Conference on Language Resources and Evaluation
     (LREC 2018), European Language Resources Association (ELRA), Miyazaki,
     Japan (2018), pp. 93-98. [pdf version]
 11. Mohammed Attia, Younes Samih, Ali Elkahky, Laura Kallmeyer. 2018.
     Multilingual Multi-class Sentiment Classification Using Convolutional
     Neural Networks. Proceedings of the Eleventh International Conference on
     Language Resources and Evaluation (LREC 2018), European Language Resources
     Association (ELRA), Miyazaki, Japan (2018), pp. 635-640. [pdf version]
 12. Mohammed Attia, Vitaly Nikolaev, Ali Elkahky. 2018. The Morpho-syntactic
     Annotation of Animacy for a Dependency Parser. Proceedings of the Eleventh
     International Conference on Language Resources and Evaluation (LREC 2018),
     European Language Resources Association (ELRA), Miyazaki, Japan (2018), pp.
     2607-2615. [pdf version]
 13. Mohamed Eldesouki, Younes Samih, Ahmed Abdelali, Mohammed Attia, Hamdy
     Mubarak, Kareem Darwish, Laura Kallmeyer. 2017. Arabic Multi-Dialect
     Segmentation: bi-LSTM-CRF vs. SVM. arXiv preprint arXiv:1708.05891. [pdf
     version]
 14. Daniel Zeman, Martin Popel, Milan Straka, Jan Hajic, Joakim Nivre, Filip
     Ginter, Juhani Luotolahti, Sampo Pyysalo, Slav Petrov, Martin Potthast,
     Francis Tyers, Elena Badmaeva, Memduh Gokirmak, Anna Nedoluzhko, Silvie
     Cinková, Jan Hajic jr, Jaroslava Hlavácová, Václava Kettnerová, Zdenka
     Uresova, Jenna Kanerva, Stina Ojala, Anna Missilä, Christopher D Manning,
     Sebastian Schuster, Siva Reddy, Dima Taji, Nizar Habash, Herman Leung,
     Marie-Catherine de Marneffe, Manuela Sanguinetti, Maria Simi, Hiroshi
     Kanayama, Kira Droganova, Héctor Martínez Alonso, Çağrı Çöltekin, Umut
     Sulubacak, Hans Uszkoreit, Vivien Macketanz, Aljoscha Burchardt, Kim
     Harris, Katrin Marheinecke, Georg Rehm, Tolga Kayadelen, Mohammed Attia,
     Ali Elkahky, Zhuoran Yu, Emily Pitler, Saran Lertpradit, Michael Mandl,
     Jesse Kirchner, Hector Fernandez Alcalde, Jana Strnadová, Esha Banerjee,
     Ruli Manurung, Antonio Stella, Atsuko Shimada, Sookyoung Kwak, Gustavo
     Mendonca, Tatiana Lando, Rattima Nitisaroj, Josie Li. 2017. CoNLL 2017
     Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies.
     Proceedings of the CoNLL 2017 Shared Task: Multilingual Parsing from Raw
     Text to Universal Dependencies. Vancouver, Canada. Pages: 1-19. [pdf
     version]
 15. Younes Samih, Mohamed Eldesouki, Mohammed Attia, Ahmed Abdelali, Hamdy
     Mubarak, Kareem Darwish, Laura Kallmeyer. 2017. Learning from Relatives:
     Unified Dialectal Arabic Segmentation. CONLL, Vancouver, Canada (2017).
     [pdf version]
 16. Younes Samih, Mohammed Attia, Mohamed Eldesouki, Hamdy Mubarak, Ahmed
     Abdelali, Laura Kallmeyer, Kareem Darwish. 2017. A Neural Architecture for
     Dialectal Arabic Segmentation. The Third Arabic Natural Language Processing
     Workshop (WANLP), Valencia, Spain (2017), pp. 46-54. [pdf version]
 17. Mohammed Attia, Ryan McDonald, Slav Petrov, Tolga Kayadelen (2017). PoS,
     Morphology and Dependencies Annotation Guidelines for Arabic. Technical
     Report. Google Inc. [pdf version]
 18. Mohammed Attia, Suraj Maharjan, Younes Samih, Laura Kallmeyer, Thamar
     Solorio. 2016. CogALex-V Shared Task: GHHH - Detecting Semantic Relations
     via Word Embeddings. CogALex-2016 Shared Task on the Corpus-Based
     Identification of Semantic Relations, Osaka, Japan (2016), pp. 86-91. [pdf
     version]
 19. Younes Samih, Suraj Maharjan, Mohammed Attia, Laura Kallmeyer, Thamar
     Solorio. 2016. Multilingual Code-switching Identification via LSTM
     Recurrent Neural Networks. Proceedings of the Second Workshop on
     Computational Approaches to Code Switching, Austin, TX (2016), pp. 50-59.
     [pdf version]
 20. Mohammed Attia, Ayah Zirikly, Mona Diab. 2016. The Power of Language
     Music: Arabic Lemmatization through Patterns. Proceedings of the Workshop
     on Cognitive Aspects of the Lexicon, Osaka, Japan (2016), pp. 40-50. [pdf
     version]
 21. Abdelati Hawwari, Mohammed Attia, Mahmoud Ghoneim and Mona Diab. 2016.
     Explicit Fine grained Syntactic and Semantic Annotation of the Idafa
     Construction in Arabic. In Proceedings of LREC 2016, Slovenia, May 2016.
     [pdf version]
 22. Mohammed Attia, Mohamed Al-Badrashiny and Mona Diab. 2015.
     GWU-HASP-2015@QALB-2015 Shared Task: Priming Spelling Candidates with
     Probability. In the second workshop on Arabic Natural Language Processing
     (ACL-IJCNLP 2015), Beijing, China, July 2015. [pdf version]
 23. Mohammed Attia, Mohamed Al-Badrashiny, Mona Diab. 2014. GWU-HASP: Hybrid
     Arabic Spelling and Punctuation Corrector. Proceedings of the EMNLP 2014
     Workshop on Arabic Natural Language Processing (ANLP), pages 148–154,
     October 25, 2014, Doha, Qatar. [pdf version]
 24. Abdelati Hawwari, Mohammed Attia, Mona Diab. 2014. A Framework for the
     Classification and Annotation of Multiword Expressions in Dialectal Arabic.
     Proceedings of the EMNLP 2014 Workshop on Arabic Natural Language
     Processing (ANLP), pages 48–56, October 25, 2014, Doha, Qatar. [pdf
     version]
 25. Mona Diab, Mohamed Al-Badrashiny, Maryam Aminian, Mohammed Attia, Heba
     Elfardy, Nizar Habash and Abdelati Hawwari. 2014. Tharwa: A Large Scale
     Dialectal Arabic - Standard Arabic - English Lexicon. The 9th edition of
     the Language Resources and Evaluation (LREC) Conference, 26-31 May,
     Reykjavik, Iceland. [pdf version]
 26. Attia, Mohammed and Josef van Genabith. 2013. A Jellyfish Dictionary for
     Arabic. eLex2013 Conference (Electronic Lexicography in the 21st Century),
     Tallinn, Estonia. [pdf version]
 27. Attia, Mohammed, Pavel Pecina, Younes Samih, Khaled Shaalan, Josef van
     Genabith. 2012. Improved Spelling Error Detection and Correction for
     Arabic. COLING 2012, Mumbai, India. [pdf version]
 28. Attia, Mohammed, Younes Samih, Khaled Shaalan, Josef van Genabith. 2012.
     The Floating Arabic Dictionary: An Automatic Method for Updating a Lexical
     Database. COLING 2012, Mumbai, India. [pdf version]
 29. Khaled Shaalan, Younes Samih, Mohammed Attia, Pavel Pecina, and Josef van
     Genabith. 2012. Arabic Word Generation and Modelling for Spell Checking.
     Language Resources and Evaluation (LREC). Istanbul, Turkey. Pages: 719-725.
     [pdf version]
 30. Shaalan, K., and Attia, M. 2012. Handling Unknown Words in Arabic FST
     Morphology. The 10th edition of the International Workshop on Finite State
     Methods and Natural Language Processing (FSMNLP) 2012, Donostia - San
     Sebastian, Spain, July 23-25, 2012. [pdf version]
 31. Mohammed Attia, Khaled Shaalan, Lamia Tounsi, and Josef van Genabith. 2012.
     Automatic Extraction and Evaluation of Arabic LFG Resources. Language
     Resources and Evaluation (LREC). Istanbul, Turkey. Pages 1947-1954. [pdf
     version]
 32. Mohammed Attia, Pavel Pecina, Lamia Tounsi, Antonio Toral, Josef van
     Genabith. 2011. Lexical Profiling for Arabic. Electronic Lexicography in
     the 21st Century. Bled, Slovenia. [pdf version]
 33. Mohammed Attia, Pavel Pecina, Lamia Tounsi, Antonio Toral, Josef van
     Genabith. 2011. An Open-Source Finite State Morphological Transducer for
     Modern Standard Arabic. International Workshop on Finite State Methods and
     Natural Language Processing (FSMNLP). Blois, France. [pdf version]
 34. Mohammed Attia, Antonio Toral, Lamia Tounsi, Pavel Pecina, Josef van
     Genabith. 2010. Construction of Language Resources for Enhancing Future
     Information Technologies. Poster presented at the Globe Forum Dublin 2010.
     The Convention Centre Dublin. Ireland. [pdf version]
 35. Mohammed Attia, Antonio Toral, Lamia Tounsi, Pavel Pecina and Josef van
     Genabith. 2010. Automatic Extraction of Arabic Multiword Expressions.
     COLING 2010 Workshop on Multiword Expressions: from Theory to Applications.
     Beijing, China. [pdf version]
 36. Mohammed Attia, Jennifer Foster, Deirdre Hogan, Joseph Le Roux, Lamia
     Tounsi and Josef van Genabith. 2010. 'Handling Unknown Words in Statistical
     Latent-Variable Parsing Models for Arabic, English and French'. First
     Workshop on Statistical Parsing of Morphologically Rich Languages (SPMRL
     2010), NAACL HLT. Los Angeles, CA. [pdf version]
 37. Mohammed Attia, Antonio Toral, Lamia Tounsi, Monica Monachini and Josef van
     Genabith. 2010. 'An automatically built Named Entity lexicon for Arabic'.
     LREC 2010. Valletta, Malta. [pdf version]
 38. Lamia Tounsi, Mohammed Attia and Josef van Genabith. 2009. 'Parsing Arabic
     Using Treebank-Based LFG Resources'. LFG09: 14th International LFG
     Conference, Trinity College, Cambridge, UK. [pdf version]
 39. Lamia Tounsi, Mohammed Attia and Josef van Genabith. 2009. 'Automatic
     Treebank-Based Acquisition of Arabic LFG Dependency Structures.'
     EACL-Workshop on Computational Approaches to Semitic Languages, Athens,
     Greece. [pdf version]
 40. Mohammed Attia. 2008. 'A Unified Analysis of Copula Constructions in LFG'.
     LFG08: 13th International LFG Conference, University of Sydney, Australia.
     [pdf version]
 41. Mohammed Attia. 2007. 'Arabic Tokenization System'. ACL-Workshop on
     Computational Approaches to Semitic Languages, Prague. [pdf version]
 42. Mohammed Attia. 2006. 'An Ambiguity-Controlled Morphological Analyzer for
     Modern Standard Arabic Modelling Finite State Networks'. The Challenge of
     Arabic for NLP/MT Conference, October 2006. The British Computer Society,
     London. [pdf version]
 43. Mohammed Attia. 2005. 'Developing a Robust Arabic Morphological Transducer
     Using Finite State Technology'. 8th Annual CLUK Research Colloquium,
     Manchester. [pdf version]

Technical Reports:

 * Mohammed Attia. 2010. 'Automatic Lexical Resource Acquisition for
   Constructing an LMF-Compatible Lexicon of Modern Standard Arabic'. The NCLT
   Seminar Series, DCU, Dublin, Ireland. [pdf version]
 * Mohammed Attia. 2008. 'Alternate Agreement in Arabic'. The ParGram Spring
   Meeting, Istanbul, Turkey. [pdf version]
 * Mohammed Attia. 2005. Functional Control and Long Distance Dependencies in
   Arabic. Parallel Grammar (ParGram) Meeting, Gotemba, Japan 2005. [pdf
   version]
 * Mohammed Attia. 2004. Report on the Introduction of Arabic to ParGram. The
   ParGram Fall Meeting 2004, The National Centre for Language Technology,
   School of Computing, Dublin City University, Ireland. [pdf version]

Presentations:

 * Mohammed Attia. 2012. 'Arabic Language: Nature and Challenges'. A
   presentation at the British University in Dubai, UAE, May 29, 2012.
   [Slides available]
 * Mohammed Attia. 2010. 'Automatic Lexical Resource Acquisition for
   Constructing an LMF-Compatible Lexicon of Modern Standard Arabic'. A
   presentation at the NCLT, Dublin City University, Ireland. [Slides available]
 * Mohammed Attia. 2008. 'From Arabic Handcrafted Grammar to Statistical
   Parsing'. A presentation at the NCLT, Dublin City University, Ireland.
   [Slides available]
 * Mohammed Attia. 2008. 'Alternate Agreement in Arabic'. Presented on my behalf
   in the ParGram Spring Meeting, Istanbul, Turkey. [Slides available]
 * Mohammed Attia. 2006. 'Issues in Arabic Grammar: from Tokenization to
   Transfer'. A presentation at the ParGram Meeting, Oxford, UK. [Slides
   available]
 * Mohammed Attia. 2005. 'Functional and Anaphoric Control in Arabic'. A
   presentation at ParGram Fall Meeting, Gotemba, Japan. [Slides available]
 * Mohammed Attia. 2005. 'Accommodating Multiword Expressions in an LFG
   Grammar'. A presentation at ParGram Fall Meeting, Gotemba, Japan. [Slides
   available]
 * Mohammed Attia. 2005. 'Developing a Robust Arabic Morphological
   Transducer/Tokenizer, and Integration with XLE'. Presented on my behalf in
   the ParGram Spring Meeting, Parc, Palo Alto, USA. [Slides available]
 * Mohammed Attia. 2004. 'Report on the Introduction of Arabic to ParGram'.
   Presented at ParGram Fall Meeting, Dublin, Ireland. [pdf version]

E-Books:

 * Mohammed Attia. 2003. 'Implications of the Agreement Features in Machine
   Translation'. M.A. Thesis.
 * Mohammed Attia. 2004. 'Common English Proverbs'. E-Books.
 * Mohammed Attia. 2007. 'Common English Expressions'. E-Books.
 * Mohammed Attia. 2008. 'Handling Arabic Morphological and Syntactic Ambiguity
   within the LFG Framework with a View to Machine Translation'. PhD Thesis.
   School of Languages, Linguistics and Cultures, the University of Manchester.
   [pdf version]
 * Mohammed Attia. 2009. 'The Translation Manual'. E-Books.
 * Mohammed Attia. 2009. 'The Translation Terminology Aid'. E-Books.
 * Mohammed Attia. 2009. 'Pigeon: A Collection of Poems'. E-Books.
 * Mohammed Attia. 2009. 'Basic English Words: A Vocabulary Bootstrap for
   Beginning Learners'. E-Books.
 * Mohammed Attia. 2009. 'Arabic Grammar Summary: A Digest of Badawi et al.
   2004 "Modern Written Arabic, A Comprehensive Grammar"'. E-Books.
 * Mohammed Attia, Mohammed Fadel, Hamdi Mansour. 2000. 'English Grammar for
   Arabs'. E-Books.


Copyright © 2003-2024. Mohammed Attia