polyglot-tools-docs

Bibliography

These are the books and research papers I've looked at. The list needs some cleaning up: not all the papers here are actually useful!

I used to have notes here. I'm cleaning them up, and they'll reappear on another page soon. (Inline notes didn't play nicely with Zotero, which I'm using to manage the research.)

A note on research limitations

Reading this research is fascinating. Much of it draws or implies conclusions from one particular kind of codebase or organisation. The trouble is that software development is extremely diverse: organisations, software architectures, and development practices all vary widely. A lot of academic research seems to make little effort to account for these differences, partly because it's genuinely hard to get your hands on a diverse range of commercial source code. Quite a bit of the research is also 20+ years old and comes from giant organisations, so it's unlikely to say much about microservices or TDD.

Books

Your Code as a Crime Scene (Tornhill 2015)

Tornhill, Adam. 2015. Your Code as a Crime Scene: Use Forensic Techniques to Arrest Defects, Bottlenecks, and Bad Design in Your Programs. 1st edition. Dallas, Texas: Pragmatic Bookshelf.

A great overview of code-maat and other related code investigation techniques. This was the book that started me looking at alternatives to traditional code metrics!

Making Software: What Really Works, and Why We Believe It (Oram and Wilson 2010)

Oram, Andy, and Greg Wilson. 2010. Making Software: What Really Works, and Why We Believe It. 1st edition. O’Reilly Media.

A collection of research across a wide range of software areas, including a good summary of the Bell, Ostrand and Weyuker papers. These authors have written a series of papers that build on each other, several of which are listed under Papers on Metrics on this page.

There are probably more! TODO: see where they are now

Papers on Metrics

Predicting Fault Incidence Using Software Change History (Graves et al. 2000)

Graves, T.L., A.F. Karr, J.S. Marron, and H. Siy. 2000. ‘Predicting Fault Incidence Using Software Change History’. IEEE Transactions on Software Engineering 26 (7): 653–61. https://doi.org/10.1109/32.859533.

Does Code Decay? Assessing the Evidence from Change Management Data (Eick et al. 2001)

Eick, S.G., T.L. Graves, A.F. Karr, J.S. Marron, and A. Mockus. 2001. ‘Does Code Decay? Assessing the Evidence from Change Management Data’. IEEE Transactions on Software Engineering 27 (1): 1–12. https://doi.org/10.1109/32.895984.

Where the Bugs Are (Ostrand, Weyuker, and Bell 2004)

Ostrand, Thomas J., Elaine J. Weyuker, and Robert M. Bell. 2004. ‘Where the Bugs Are’. In Proceedings of the 2004 ACM SIGSOFT International Symposium on Software Testing and Analysis, 86–96. ISSTA ’04. New York, NY, USA: Association for Computing Machinery. https://doi.org/10.1145/1007512.1007524.

Predicting Source Code Changes by Mining Change History (Ying et al. 2004)

Ying, A.T.T., G.C. Murphy, R. Ng, and M.C. Chu-Carroll. 2004. ‘Predicting Source Code Changes by Mining Change History’. IEEE Transactions on Software Engineering 30 (9): 574–86. https://doi.org/10.1109/TSE.2004.52.

Use of Relative Code Churn Measures to Predict System Defect Density (Nagappan and Ball 2005)

Nagappan, Nachiappan, and Thomas Ball. 2005. ‘Use of Relative Code Churn Measures to Predict System Defect Density’. In Proceedings of the 27th International Conference on Software Engineering, 284–292. ICSE ’05. New York, NY, USA: Association for Computing Machinery. https://doi.org/10.1145/1062455.1062514.

Looking for Bugs in All the Right Places (Bell, Ostrand, and Weyuker 2006)

Bell, Robert M., Thomas J. Ostrand, and Elaine J. Weyuker. 2006. ‘Looking for Bugs in All the Right Places’. In Proceedings of the 2006 International Symposium on Software Testing and Analysis - ISSTA’06, 61. Portland, Maine, USA: ACM Press. https://doi.org/10.1145/1146238.1146246.

Using Software Dependencies and Churn Metrics to Predict Field Failures: An Empirical Case Study (Nagappan and Ball 2007)

Nagappan, Nachiappan, and Thomas Ball. 2007. ‘Using Software Dependencies and Churn Metrics to Predict Field Failures: An Empirical Case Study’. In First International Symposium on Empirical Software Engineering and Measurement (ESEM 2007), 364–73. Madrid, Spain: IEEE. https://doi.org/10.1109/ESEM.2007.13.

Automating Algorithms for the Identification of Fault-Prone Files (Ostrand, Weyuker, and Bell 2007)

Ostrand, Thomas J., Elaine J. Weyuker, and Robert M. Bell. 2007. ‘Automating Algorithms for the Identification of Fault-Prone Files’. In Proceedings of the 2007 International Symposium on Software Testing and Analysis - ISSTA ’07, 219. London, United Kingdom: ACM Press. https://doi.org/10.1145/1273463.1273493.

Using Developer Information as a Factor for Fault Prediction (Weyuker, Ostrand, and Bell 2007)

Weyuker, Elaine J., Thomas J. Ostrand, and Robert M. Bell. 2007. ‘Using Developer Information as a Factor for Fault Prediction’. In Third International Workshop on Predictor Models in Software Engineering (PROMISE’07: ICSE Workshops 2007), 8. Minneapolis, MN, USA: IEEE. https://doi.org/10.1109/PROMISE.2007.14.

A Metric for Software Readability (Buse and Weimer 2008)

Buse, Raymond P.L., and Westley R. Weimer. 2008. ‘A Metric for Software Readability’. In Proceedings of the 2008 International Symposium on Software Testing and Analysis - ISSTA ’08, 121. Seattle, WA, USA: ACM Press. https://doi.org/10.1145/1390630.1390647.

Reading Beside the Lines: Indentation as a Proxy for Complexity Metric (Hindle, Godfrey, and Holt 2008)

Hindle, Abram, Michael W. Godfrey, and Richard C. Holt. 2008. ‘Reading Beside the Lines: Indentation as a Proxy for Complexity Metric’. In 2008 16th IEEE International Conference on Program Comprehension, 133–42. https://doi.org/10.1109/ICPC.2008.13.

The Influence of Organizational Structure on Software Quality: An Empirical Case Study (Nagappan, Murphy, and Basili 2008)

Nagappan, Nachiappan, Brendan Murphy, and Victor Basili. 2008. ‘The Influence of Organizational Structure on Software Quality: An Empirical Case Study’. In Proceedings of the 30th International Conference on Software Engineering, 521–530. ICSE ’08. New York, NY, USA: Association for Computing Machinery. https://doi.org/10.1145/1368088.1368160.

Do Too Many Cooks Spoil the Broth? Using the Number of Developers to Enhance Defect Prediction Models (Weyuker, Ostrand, and Bell 2008)

Weyuker, Elaine J., Thomas J. Ostrand, and Robert M. Bell. 2008. ‘Do Too Many Cooks Spoil the Broth? Using the Number of Developers to Enhance Defect Prediction Models’. Empirical Software Engineering 13 (5): 539–59. https://doi.org/10.1007/s10664-008-9082-8.

On the Relationship Between Change Coupling and Software Defects (D’Ambros, Lanza, and Robbes 2009)

D’Ambros, Marco, Michele Lanza, and Romain Robbes. 2009. ‘On the Relationship Between Change Coupling and Software Defects’. In 2009 16th Working Conference on Reverse Engineering, 135–44. Lille, France: IEEE. https://doi.org/10.1109/WCRE.2009.19.

Predicting Faults Using the Complexity of Code Changes (Hassan 2009)

Hassan, Ahmed E. 2009. ‘Predicting Faults Using the Complexity of Code Changes’. In 2009 IEEE 31st International Conference on Software Engineering, 78–88. Vancouver, BC, Canada: IEEE. https://doi.org/10.1109/ICSE.2009.5070510.

Cross-Project Defect Prediction: A Large Scale Experiment on Data vs. Domain vs. Process (Zimmermann et al. 2009)

Zimmermann, Thomas, Nachiappan Nagappan, Harald Gall, Emanuel Giger, and Brendan Murphy. 2009. ‘Cross-Project Defect Prediction: A Large Scale Experiment on Data vs. Domain vs. Process’. In Proceedings of the 7th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on The Foundations of Software Engineering on European Software Engineering Conference and Foundations of Software Engineering Symposium - ESEC/FSE ’09, 91. Amsterdam, The Netherlands: ACM Press. https://doi.org/10.1145/1595696.1595713.

Comparing the Effectiveness of Several Modeling Methods for Fault Prediction (Weyuker, Ostrand, and Bell 2010)

Weyuker, Elaine J., Thomas J. Ostrand, and Robert M. Bell. 2010. ‘Comparing the Effectiveness of Several Modeling Methods for Fault Prediction’. Empirical Software Engineering 15 (3): 277–95. https://doi.org/10.1007/s10664-009-9111-2.

Does Measuring Code Change Improve Fault Prediction? (Bell, Ostrand, and Weyuker 2011)

Bell, Robert M., Thomas J. Ostrand, and Elaine J. Weyuker. 2011. ‘Does Measuring Code Change Improve Fault Prediction?’ In Proceedings of the 7th International Conference on Predictive Models in Software Engineering - Promise ’11, 1–8. Banff, Alberta, Canada: ACM Press. https://doi.org/10.1145/2020390.2020392.

Don’t Touch My Code!: Examining the Effects of Ownership on Software Quality (Bird et al. 2011)

Bird, Christian, Nachiappan Nagappan, Brendan Murphy, Harald Gall, and Premkumar Devanbu. 2011. ‘Don’t Touch My Code!: Examining the Effects of Ownership on Software Quality’. In Proceedings of the 19th ACM SIGSOFT Symposium and the 13th European Conference on Foundations of Software Engineering - SIGSOFT/FSE ’11, 4. Szeged, Hungary: ACM Press. https://doi.org/10.1145/2025113.2025119.

Non-Essential Changes in Version Histories (Kawrykow and Robillard 2011)

Kawrykow, David, and Martin P. Robillard. 2011. ‘Non-Essential Changes in Version Histories’. In Proceeding of the 33rd International Conference on Software Engineering - ICSE ’11, 351. Waikiki, Honolulu, HI, USA: ACM Press. https://doi.org/10.1145/1985793.1985842.

Evaluating Complexity, Code Churn, and Developer Activity Metrics as Indicators of Software Vulnerabilities (Shin et al. 2011)

Shin, Yonghee, Andrew Meneely, Laurie Williams, and Jason A. Osborne. 2011. ‘Evaluating Complexity, Code Churn, and Developer Activity Metrics as Indicators of Software Vulnerabilities’. IEEE Transactions on Software Engineering 37 (6): 772–87. https://doi.org/10.1109/TSE.2010.81.

An Empirical Study on the Impact of Duplicate Code (Hotta et al. 2012)

Hotta, Keisuke, Yui Sasaki, Yukiko Sano, Yoshiki Higo, and Shinji Kusumoto. 2012. ‘An Empirical Study on the Impact of Duplicate Code’. Advances in Software Engineering 2012: 1–22. https://doi.org/10.1155/2012/938296.

Code Smells Quantification: A Case Study On Large Open Source Research Codebase (Chauhan 2019)

Chauhan, Swapnil. 2019. ‘Code Smells Quantification: A Case Study On Large Open Source Research Codebase’. Open Access Theses & Dissertations, January. https://scholarworks.utep.edu/open_etd/50.

Attitudes, Beliefs, and Development Data Concerning Agile Software Development Practices (Matthies et al. 2019)

Matthies, Christoph, Johannes Huegle, Tobias Durschmid, and Ralf Teusner. 2019. ‘Attitudes, Beliefs, and Development Data Concerning Agile Software Development Practices’. In 2019 IEEE/ACM 41st International Conference on Software Engineering: Software Engineering Education and Training (ICSE-SEET), 158–69. Montreal, QC, Canada: IEEE. https://doi.org/10.1109/ICSE-SEET.2019.00025.

Source Code Properties of Defective Infrastructure as Code Scripts (Rahman and Williams 2019)

Rahman, Akond, and Laurie Williams. 2019. ‘Source Code Properties of Defective Infrastructure as Code Scripts’. Information and Software Technology 112 (August): 148–63. https://doi.org/10.1016/j.infsof.2019.04.013.

Papers on Voronoi treemaps

Computing Voronoi Treemaps: Faster, Simpler, and Resolution-Independent (Nocaj and Brandes 2012)

Nocaj, Arlind, and Ulrik Brandes. 2012. ‘Computing Voronoi Treemaps: Faster, Simpler, and Resolution-Independent’. Computer Graphics Forum 31 (3pt1): 855–64. https://doi.org/10.1111/j.1467-8659.2012.03078.x.

Still to find and read

"Code smells for multi-language systems" paper from europlop 19

"visual detection of design anomalies" - refers to LSP05
