Rizwan Asif I am a technopreneur exploring the field of artificial intelligence to build the next big thing in tech. My vision for the world is to reduce human capital for low level system tasks and put humans in the higher decision making positions. We are not meant to sort stamp papers, we will give up that task to create something beautiful.

Case Study for Semantic Search in Requirements Specification

Author: Rizwan Asif

Supervisor: Fagerholm, Fabian

Advisors: Hujanen, Jaakko; Kärkkäinen, Leo

DOI + Citation: http://urn.fi/URN:NBN:fi:aalto-202008234986

Requirements engineering is an integral part of industrial engineering processes and produces requirements specifications in the form of technical documentation. These documents use a technical variety of natural language that is uncommon in other natural language documents. Moreover, requirements are routinely traced, that is, linked to one another, a practice rarely found in other natural language documents.

In this thesis we create a case study to understand requirements engineering practices. The case study is based on building a search engine that could benefit requirements engineers while addressing the natural language understanding challenges of technical documents. To find a better fit for requirements engineers, we start with a traditional search engine and then augment it with three separately trained models, creating three additional neural search engines. We used qualitative analysis to assess the effectiveness of each search engine and to understand user needs. This thesis contributes to the natural language understanding of requirements engineering documentation.

Our results indicate that plain-text, frequency-based search engines are sufficient for requirements engineers. However, neural models trained on a diverse set of data can improve borderline cases and the results overall. These conclusions are limited to qualitative assessment because comparative data for a quantitative assessment was not available.
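To make "frequency-based search" concrete, here is a minimal sketch of ranking requirement statements by TF-IDF weight. It is not the search engine built in the thesis; the function names, the smoothed IDF formula, and the sample requirements are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Return per-document term counts and inverse document frequencies."""
    term_counts = [Counter(tokenize(d)) for d in docs]
    doc_freq = Counter()
    for counts in term_counts:
        doc_freq.update(counts.keys())
    n_docs = len(docs)
    # Smoothed IDF (an assumed variant): rarer terms get a larger weight.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    return term_counts, idf


def search(query, docs, top_k=3):
    """Rank documents by the summed TF-IDF weight of the query terms."""
    term_counts, idf = build_index(docs)
    query_terms = tokenize(query)
    scored = []
    for i, counts in enumerate(term_counts):
        length = sum(counts.values()) or 1
        score = sum((counts[t] / length) * idf.get(t, 0.0) for t in query_terms)
        if score > 0:
            scored.append((score, i))
    # Highest score first; ties fall back to document order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(i, round(s, 4)) for s, i in scored[:top_k]]


# Hypothetical requirement statements, invented for illustration only.
REQUIREMENTS = [
    "The emergency stop shall cut power to the motor within 100 ms.",
    "The operator display shall show the current motor speed.",
    "All safety functions shall be logged with a timestamp.",
]

results = search("motor emergency stop", REQUIREMENTS)
```

Here `results` lists the emergency-stop requirement ahead of the display requirement, since it matches all three query terms. A search like this only matches exact tokens, so a query such as "halt" would return nothing; that kind of borderline case is where a neural model might help.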


