Developer Focus: Lack of Impact on Maintainability

We were looking for evidence of a connection between source code quality erosion and developer focus. We assumed that more focused developers, i.e., those who are concerned with a well-specified part of the source code at a time, are likely to commit higher quality code than those who are less focused, i.e., committing to various parts of the code. We estimated code quality with the ColumbusQM quality model and developer focus with structural scattering. Although the assumption sounds logical, we could not find any supporting evidence. As structural scattering assigns a measure to a set of source files/classes (i.e., how close they are to each other in the package hierarchy), we could apply it in various ways. First, we defined developer focus as the structural scattering of the set of source files in a commit, to validate whether more focused changes have a better impact on maintainability than less focused ones. Second, we calculated the structural scattering of all the files the developer of a commit modified in the last 3 months and assigned this measure as the developer focus of that commit. With this test we checked whether more focused developers tend to commit better quality code than less focused ones. We also performed this test for every developer separately, considering only the subset of the commits created by that particular developer. We calculated the level of developer focus and the maintainability change for every commit of three open-source and one proprietary software system. With the help of the Wilcoxon rank test we compared the focus values of commits causing a maintainability increase with those causing a maintainability decrease. The results are inconclusive and do not even tend in the same direction; therefore we did not find any evidence of a connection between maintainability and developer focus. This is therefore a publication of negative results.


Introduction
Maintenance of software consumes considerable effort; a high proportion of total software development costs is spent on this activity. Source code maintainability is in direct connection with maintenance costs [2]. Our motivation in this work was to investigate the effect of the development process on the maintainability of the code. Our goal was to explore typical patterns causing similar changes in software maintainability, which could either help to avoid software erosion or provide information about how to better allocate effort to improve software maintainability.
We already investigated this area of research in previous works [6-11]. In article [11] we showed that a strong connection exists between version control operations and the maintainability of the source code. In study [7] we revealed the connection between version control operations and maintainability. We found that file additions have a rather positive and file updates a negative effect on maintainability. A clear effect of file deletions was not identified. In article [6] we presented the results of a variance analysis. We found that file additions and file deletions increase the variance of the maintainability, while the Update operation decreases it. In work [9] we analyzed code churn, i.e., the intensity of past modifications. We found that modifying high-churn code is more likely to decrease the overall maintainability of a software system. In study [8] we considered the developer and investigated how code ownership impacts maintainability. We concluded that common code is more likely to erode further than code with clear ownership. In article [10] we defined a few version control history metrics and checked their connection with maintainability. Our tests showed that code with a higher intensity of modifications, a higher number of modifications and developers, an older age, and a later last modification date has lower maintainability and a higher number of post-release bugs.
Up to now we have published only positive results in this area of research. But we think that publishing negative results is also very important; a negative result can be very helpful as well. This paper presents the negative results of a study performed, but never published, as part of our research investigating how code ownership impacts maintainability [8]. That paper was motivated by the works of others [3,4,21] revealing an increased bug-prediction capability of models that include some form of code ownership as a predictor. We could confirm that clear ownership has a positive effect on code maintainability (measured by a combination of well-known source code metrics) as well. This result showed that common code is more likely to erode than code with clear ownership, which complements and generalizes the bug-prediction studies on this topic. As part of this work, we wanted to show that the focus of developers also has a significant impact on software maintainability. This assumption was inspired by Di Nucci et al. [5], who investigated the impact of developer focus on post-release bugs. They found that a bug prediction model including the structural and semantic scattering metrics, which measure how close the modified code parts are structurally and conceptually, outperforms models not using these indicators.
We defined developer focus based on their definition of structural scattering [5], which builds on the distance between the files the developers work with at a time: the number of steps needed to get from the package of one file to another.
The closest files, having distance 0, are those located in the same package. The focus of a set of source files is the normalized aggregation of the pairwise distances.
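This package-level distance can be sketched in a few lines of Python. The helper below is hypothetical, for illustration only; it assumes classes are identified by fully qualified names:

```python
def package_distance(class_a, class_b):
    """Number of steps between the packages of two fully qualified
    class names: up to the common ancestor package, then down again."""
    pkg_a = class_a.split(".")[:-1]  # drop the class name itself
    pkg_b = class_b.split(".")[:-1]
    # length of the common package prefix
    common = 0
    for a, b in zip(pkg_a, pkg_b):
        if a != b:
            break
        common += 1
    # steps up from pkg_a to the common ancestor, plus steps down to pkg_b
    return (len(pkg_a) - common) + (len(pkg_b) - common)
```

Two classes in the same package get distance 0, matching the definition above.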
A developer-oriented study of developer focus (i.e., one that investigates the effect of what else the developer modified) would complement the code ownership study [8], which took a source-code-oriented approach (i.e., studied the effect of who else modified the same source code).
Formally, we investigated the following research questions:

RQ1: Do more focused commits (i.e., commits affecting a set of source files with a low scattering value) have a better impact on maintainability than less focused commits?

RQ2: Do developers who were more focused in the past (i.e., for whom the scattering value of the set of files they changed in the past 3 months is low) tend to commit more maintainable code, compared to less focused ones?
To answer these questions we studied the code change history of three open-source systems and an industrial one. Our null hypothesis was that there is no connection between developer focus and the maintainability of the source code.
Based on the statistical tests, unfortunately, we could not reject the null hypothesis; therefore we could not report evidence that developer focus impacts code maintainability.
All data for replicating our study is available as an online appendix at: http://www.inf.u-szeged.hu/~ferenc/papers/DeveloperFocus/

The remainder of the paper is organized as follows. Section 2 provides a brief overview of works related to this research. In Section 3 we present how we collected the data and what kinds of tests we performed. In Section 4 we present the results of the statistical tests. We conclude the paper in Section 5.

Related Work
Code ownership, which is very close to the topic of this paper, is widely investigated. The results are contradictory: some researchers find a significant correlation between code ownership and code quality, while others do not.
Nordberg and Martin [18] describe in their study four types of code ownership: product specialist, subsystem ownership, chief architect, and collective ownership, and discuss the advantages and disadvantages of each model. LaToza et al. performed two surveys and eleven interviews with software developers at Microsoft regarding software development questions, and presented the results in article [17]. Some of the questions were related to code ownership and are strongly related to this study. The authors made an interesting statement: code ownership can also go wrong, as code that is understood and maintained by a single developer makes that individual too indispensable.
As an alternative to individual code ownership, they investigated team code ownership. This topic would also be interesting for us: to somehow identify teams and define team-level code ownership and team-level focus instead of individual-level ones.
In their work, Fritz et al. [13] investigated the frequency and the elapsed time of developers' interactions with the code. They asked questions to find out whether developers can recall details about the source code: types of variables, types of parameters, method names, calls to other methods, and the methods that call a specified method. The results supported their hypothesis: developers know code they modified frequently and recently better than foreign code. This study is strongly related to code ownership and developer focus, as a more focused developer might recall source code elements of the related code better than less focused developers.
Weyuker et al. [21] tried to enhance their defect prediction model by including the number of developers. They found that the achieved improvement is negligible, which is similar to the results we present in this study.
The same authors (Bell et al. [3]) tried to improve their defect prediction model by considering individual developers: they investigated whether files in a large system that are modified by an individual developer consistently contain either more or fewer faults than the average of all files in the system. They found negligible improvement: the study indicates that adding information to a model about which particular developer modified a file is not likely to improve defect predictions.
Hattori et al. [16] analyzed the problem of code ownership, especially finding the hidden co-authors. Their model also considers developer interaction information, which could lead to finer results in the case of developer focus as well.
The study by Bird et al. [4] investigated the effects of ownership on software quality, where software quality was measured by pre-release faults and post-release failures. They performed the analysis on the binary and release level of the source code of Windows Vista and Windows 7. For a binary they defined the terms minor contributor (a developer who contributed at most 5% of the total commits), major contributor (above 5%), and ownership (the proportion of the commits of the highest contributor). Among others, they found that software components with many minor contributors had more failures than other software components. Moreover, a high level of ownership resulted in fewer defects.
Rahman et al. [20] introduced a code ownership and experience based defect prediction model, but instead of just considering the modifications performed on the source file itself, they introduced a fine-grained level by analyzing the contributions to code fragments. This approach could be a good direction for future investigation of developer focus as well: besides the source code package, other source code elements like classes or functions could also be taken into account.
Greiler et al. [14] defined several contributor-related metrics and used them in a defect prediction model. Their findings confirm the original finding of Bird et al. [4] that code ownership correlates with code quality.
On the other hand, Foucault et al. [12] performed a similar study to Bird et al. [4] on open-source systems, but they found that the relationship between ownership metrics and module faults is weak. They performed an in-depth analysis.

Di Nucci et al. [5] investigated the impact of developer focus on post-release bugs. They defined structural and semantic scattering (we implemented our developer focus value based on their structural scattering definition), and found that a bug prediction model including these metrics outperforms models without them.
We also experienced these contradictory results: our earlier model [8], considering the number of developers, yielded a not particularly strong but still significant result, while this study, considering developer focus, did not show a significant correlation at all.

Overview
As we wanted to analyze the connection between developer focus and maintainability, we had to find a method to express both numerically. Neither of them is a trivial concept, and currently there are no exact definitions of how to compute them.
For maintainability, we applied the same calculation method as in our previous studies [6-11]. We present the maintainability estimation method in detail in Section 3.3. The quality model we use is capable of analyzing individual revisions of a system, therefore we chose to work on a per-commit basis.
We calculated the developer focus values based on the definition of structural scattering by Di Nucci et al. [5]. We present the calculation details in Section 3.4. Section 3.5 describes the statistical tests we used to analyze the data. In Section 3.6, we explain the decisions we made during the elaboration of the methodology.

Preliminary Steps
As a first step we did some data cleaning. The analyzed software systems were all written in Java. As the quality model we use considers Java source files only, we removed the non-Java files (e.g., XML files) from the input. If a commit contained only non-Java files, we removed the commit as well. Thus we worked on an input commit set that contained Java source files exclusively, and each analyzed revision contained at least one affected Java file.
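The cleaning step can be illustrated with a short sketch. The commit representation below (a mapping from commit id to the list of touched file paths) is an assumption for illustration, not the format we actually used:

```python
def clean_commits(commits):
    """Keep only .java files in each commit; drop commits that
    end up with no Java files at all."""
    cleaned = {}
    for commit_id, files in commits.items():
        java_files = [f for f in files if f.endswith(".java")]
        if java_files:  # skip commits touching no Java source
            cleaned[commit_id] = java_files
    return cleaned
```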

Estimation of the Maintainability Change
We used the ColumbusQM [1] probabilistic software quality model for estimating the maintainability value of every revision. It considers the following source code metrics: logical lines of code, number of ancestors, maximum nesting level, coupling between object classes, clone coverage, number of parameters, McCabe's cyclomatic complexity, number of incoming invocations, number of outgoing invocations, and number of coding rule violations. The basis of this model is the fact that there is a negative correlation between these metrics and software maintainability [15]. The quality model compares these metrics of the analyzed system with those of other systems in a benchmark, and then aggregates the results of the comparisons using weights provided by developers.
From this study's viewpoint we treat the quality model as a black box. Details of the model are described in the work of Bakota et al. [1]. The authors validated the model, and they also revealed the correlation between the estimated quality value and development costs [2]. The quality model yields a real number between 0 and 1; higher values indicate better maintainability.
For each analyzed system we calculated the maintainability values of every revision available in its version control system. As a next step we calculated the difference of the maintainability values of subsequent revisions, and then considered the sign of the result: positive, zero, or negative, indicating that the actual commit increased, did not considerably change, or decreased the maintainability of the source code, respectively.
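As a sketch, assuming the per-revision maintainability values are already available as a list in commit order, the change directions can be derived as follows:

```python
def maintainability_change_signs(values):
    """Given per-revision maintainability values in commit order,
    return +1 / 0 / -1 per commit: increase, no considerable change,
    or decrease relative to the previous revision."""
    signs = []
    for prev, curr in zip(values, values[1:]):
        diff = curr - prev
        signs.append(0 if diff == 0 else (1 if diff > 0 else -1))
    return signs
```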

Calculation of Developer Focus
For calculating the focus of developers, we adopted the definition of structural scattering described by Di Nucci et al. (see [5], page 243).
Let CH_{d,p} be the set of classes changed by a developer d during a time period p. The authors defined the structural scattering measure as

    StrScat_{d,p} = |CH_{d,p}| × average_{c_i, c_j ∈ CH_{d,p}} dist(c_i, c_j),

where dist is the number of steps to be taken in order to go from class c_i to class c_j. For example, the dist between classes pkg.entities.User and pkg.logic.util.Convert is 3: pkg.entities → pkg → pkg.logic → pkg.logic.util. The multiplication factor |CH_{d,p}| at the beginning of the formula normalizes the distances between the code components and assigns a higher scattering to developers working on a higher number of code components in the given time period.
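A minimal Python sketch of this measure, assuming classes are identified by fully qualified names (returning 0 for fewer than two classes is our own convention for the degenerate case):

```python
from itertools import combinations

def package_distance(class_a, class_b):
    # steps between the packages of two fully qualified class names
    pkg_a, pkg_b = class_a.split(".")[:-1], class_b.split(".")[:-1]
    common = 0
    for a, b in zip(pkg_a, pkg_b):
        if a != b:
            break
        common += 1
    return (len(pkg_a) - common) + (len(pkg_b) - common)

def structural_scattering(changed_classes):
    """|CH| times the average pairwise package distance of the
    classes in CH; fewer than two classes scatter 0 by convention."""
    n = len(changed_classes)
    if n < 2:
        return 0.0
    pairs = list(combinations(changed_classes, 2))
    avg = sum(package_distance(a, b) for a, b in pairs) / len(pairs)
    return n * avg
```

With the example above, the two classes at distance 3 give a scattering of 2 × 3 = 6.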

Comparison Tests
Once we had a maintainability change direction and a developer focus value for every commit in the revision history, we could check whether there is any connection between the maintainability change direction and the focus of the developers. The null hypothesis was that there is no significant difference between these values. The alternative hypothesis was that the developer focus values related to commits with positive maintainability changes are significantly lower than those related to negative maintainability changes, meaning that more focused commits (i.e., commits with a low scattering value) are more likely to increase maintainability.
We performed the Wilcoxon rank-sum test on the data, as it is suitable for the kind of data we have (e.g., not normally distributed, with outliers). This test compares every element of the first data set with every element of the other one, taking all possible pairs into consideration.
According to our null hypothesis, the number of greater elements should be roughly the same as the number of smaller ones. The alternative hypothesis expresses that the elements of one of the sets should be significantly higher than the elements of the other.
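The pairwise comparison underlying the test can be illustrated by computing the U statistic of the rank-sum test. This is a sketch only (the actual analysis used R's wilcox.test(), which also derives the p-value from U):

```python
def rank_sum_u(sample_a, sample_b):
    """U statistic of the Wilcoxon rank-sum (Mann-Whitney) test:
    the number of (a, b) pairs with a < b, counting ties as 1/2.
    Under the null hypothesis U is close to len(a) * len(b) / 2."""
    u = 0.0
    for a in sample_a:
        for b in sample_b:
            if a < b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u
```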
We used the R statistical program [19] for performing the tests, calling the wilcox.test() function. As a result, we got p-values for all software systems we performed the test on. We ran tests with 3 different setups: 1) for answering RQ1; 2) and 3) for answering RQ2.

Commit-based Comparison Tests. In this case we used every commit for the developer focus calculation, and calculated the focus value considering all the files affected by that commit, as described in Section 3.4. As a result we got a developer focus value for every commit.
Developer-based Comparison Tests. In this case we first calculated a running developer focus value for every developer. For every commit we considered the developer who performed it and took all the files that developer changed in the previous 3 months, thereby simulating the process of forgetting. Furthermore, we omitted commits containing more than 20 files, because those would have introduced a large bias. For example, a directory rename could affect a large number of source files, which would drastically increase the scattering value of that developer for the next 3 months, even though in reality the developer has not lost focus.
We defined the commit-related focus value to be the focus value of the developer who performed the commit.
Individual-based Comparison Tests. In this case we considered the version control history individually for every developer, taking only the commits performed by that developer. This results in as many version control sub-histories as there were developers contributing to the source code. In every case we applied the methodology described for the developer-based comparison tests.
In order to avoid non-explanatory results we excluded the developers who contributed too little, i.e., whose sub-history was too short to analyze. We considered only those developers who contributed at least 5 commits resulting in a maintainability increase and at least 5 commits resulting in a maintainability decrease.

Discussion
The commit-based comparison test is a rough approach; however, our expectation was that if a connection between maintainability change and developer focus existed, it could have been observed even with this method. In this case we did not consider earlier modifications by the developer, just the actual commit (i.e., we did not consider past focus).
The developer-based comparison tests, on the other hand, are more fine-grained, and we thought of them as the main outcome of our study. The focus value calculation would be the same as in the commit-based case if we considered the union of all the contributions of the actual developer in the past 3 months (with the exception of huge commits). Our expectation was that with this focus value the comparison tests would yield significant results.
In the case of the individual-based comparison tests we practically sliced the whole version control history into as many pieces as there were developers contributing to the project, and kept only those that contained a sufficient number of commits.
This resulted in several tests per analyzed system (i.e., a separate test for every developer). With this test we wanted to ensure that we find per-developer patterns even if we cannot find a general connection between developer focus and maintainability.

Examined Software Systems
We executed the tests on four independent software systems. Our selection criteria for the subject systems were the following: availability of at least 1000 commits and at least 200% code growth during the analyzed period. We performed the analysis of the following systems:

Ant, a command-line tool for building Java applications (http://ant.apache.org). Altogether 37 developers contributed at least once. The total number of available commits was more than 6000.
Struts 2, a framework for creating enterprise-ready Java web applications (http://struts.apache.org/). The number of developers was 26.
Tomcat, an implementation of the Java Servlet and JavaServer Pages technologies (http://tomcat.apache.org).

A bug in our software could also lead to negative results. To minimize this risk, we added unit tests to our implementation. Furthermore, we experimented with different forgetting intervals (instead of the actually used 3-month interval).

Conclusions and Future Work
As we already stressed in the introduction, this is a publication of negative results.
According to the statistical tests we performed, we could not reject the null hypothesis of no connection: we found no evidence that more focused developers tend to contribute better quality code, nor that more focused commits tend to improve code quality.