Abstract
Coordination among developers is crucial in large-scale open-source software projects, where developers are often distributed across the entire planet. By assessing the alignment of collaboration and communication in such software projects in terms of coordination requirements, we can estimate whether a state of socio-technical congruence is achieved, which is associated with software quality and project success. By means of an empirical study on a substantial set of large-scale open-source software projects (including OpenSSL, Git, and LLVM), which together make up over 180 years of development history, we aim to shed light on this issue. Compared to the state of the art in this research area, we identify coordination requirements arising not only from files and functions but also from features. This way, we take a more semantic view of this phenomenon. We found that open-source developers fulfill coordination requirements on purpose, but mostly those coordination requirements arising from coupled source-code artifacts, while they resolve simpler ones independently. Furthermore, we found that none of the considered abstraction levels of source-code artifacts (files, functions, features) is more suitable than the others as a basis for constructing coordination requirements with respect to their fulfillment. This finding strongly indicates that features do not play as important a role in the development process as expected and commonly believed by the research community in the area of feature-oriented and feature-driven development. Finally, we identified notable evolutionary trends in the fulfillment of coordination requirements and showed that far-reaching social events have a huge impact on their fulfillment, both negatively and positively.
The key findings of our empirical study are that socio-technical relations are important to understand open-source development communities and that the incorporation of different abstraction levels for developer collaboration does yield important insights to further improve the evolution in open-source software projects.
Keywords: coordination requirements, socio-technical congruence, social-network analysis, coronet, Codeface, open-source software systems, configurable systems, software product lines, feature-oriented software development
Research Questions and Hypotheses
Research Questions
- RQ1
- Does developer communication align with artifact-based coordination requirements in real-world OSS projects such that the coordination requirements are fulfilled?
- RQ2
- Does developer communication align better with feature-based coordination requirements than with function-based or file-based coordination requirements?
- RQ3
- Does the degree of fulfillment of coordination requirements change for different artifact types during project evolution?
Hypotheses
- RH1
- A high number of developer pairs collaborating on the same artifacts do exchange e-mails on the same threads of the mailing list, such that coordination requirements arising from any type of artifact are fulfilled not only by chance.
- RH2
- The fraction of fulfilled coordination requirements is lower for the square motif than for the triangle motif, independent of the observed artifact abstraction.
- RH3
- The fraction of fulfilled coordination requirements differs for the different artifacts, and is significantly highest for the abstraction level of features.
- RH4
- In later stages of development, the fraction of fulfilled coordination requirements is higher than in earlier stages for all motifs and artifacts.
Network Approach
Coordination-Requirement Networks and Motifs
To analyze the fulfillment of coordination requirements in a software project, we construct coordination-requirement networks, which we can analyze with network-analytic methods. We show an exemplary coordination-requirement network in Figure 1. Formally, such a network is defined as an undirected graph G = (D ∪ A, E), where we encode developers (set D) and artifacts (set A) as vertices; E is the set of edges among the vertices. We encode the following three relations in the edges:
- Developer–artifact relation: Developers work on code artifacts while committing to the project’s version control system. Such artifacts may be files, functions, or features (which may crosscut the file and function decomposition).
- Artifact–artifact relation: Artifacts can be related in various ways, giving rise to interdependencies. We consider co-changes to describe logical coupling among artifacts. The term “co-changes” refers to artifacts that are concurrently changed in a single commit in the project’s version control system.
- Developer–developer relation: We consider contributions of the developers to their project’s mailing list: In line with the seminal work by Bird et al., we assume that two developers coordinate their work iff they contribute to the same thread on the mailing list [8].
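The three relations can be illustrated with a minimal sketch. It assumes a simplified input format (commits as author/artifact-list pairs, mailing-list threads as participant lists); the function name `build_relations` and the toy data are hypothetical and not part of Codeface or coronet.

```python
from itertools import combinations


def build_relations(commits, threads):
    """Derive the three edge sets of a coordination-requirement network.

    commits: list of (author, [artifacts changed in that commit])
    threads: list of [developers who contributed to that thread]
    """
    dev_artifact = set()       # developer-artifact edges
    artifact_artifact = set()  # co-change edges (same commit)
    dev_dev = set()            # communication edges (same thread)

    for author, artifacts in commits:
        for artifact in artifacts:
            dev_artifact.add((author, artifact))
        # Artifacts changed together in one commit are logically coupled.
        for a1, a2 in combinations(sorted(set(artifacts)), 2):
            artifact_artifact.add(frozenset((a1, a2)))

    for participants in threads:
        # Two developers coordinate iff they contribute to the same thread.
        for d1, d2 in combinations(sorted(set(participants)), 2):
            dev_dev.add(frozenset((d1, d2)))

    return dev_artifact, artifact_artifact, dev_dev


# Toy data (illustrative only)
commits = [("alice", ["a.c", "b.c"]), ("bob", ["b.c"]), ("carol", ["c.c"])]
threads = [["alice", "bob"], ["carol"]]
dev_artifact, artifact_artifact, dev_dev = build_relations(commits, threads)
```

Edges between two artifacts or two developers are stored as `frozenset` pairs because the network is undirected.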
To automatically identify coordination requirements in coordination-requirement networks, we encode coordination requirements or, rather, the patterns they represent as network motifs. Network motifs are recurrent sub-graphs in a given network [75]. Motifs can be described formally as a set of vertices (e.g., {d1, d2, a1}) with specific edges connecting them. We show two network motifs for coordination requirements in Figure 2, the triangle motif and the square motif.
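As a rough illustration of motif matching, the following sketch searches for both motifs on explicit edge sets. It assumes that the triangle motif consists of two developers who changed the same artifact and that the square motif consists of two developers who changed two different, coupled artifacts; in both cases, the requirement counts as fulfilled if the two developers are connected by a communication edge. These definitions and all function names are assumptions made for this example.

```python
from itertools import combinations


def developers_by_artifact(dev_artifact):
    """Group developers by the artifact they changed."""
    grouped = {}
    for developer, artifact in dev_artifact:
        grouped.setdefault(artifact, set()).add(developer)
    return grouped


def find_triangles(dev_artifact):
    """Return (d1, d2, artifact) for each developer pair sharing an artifact."""
    found = set()
    for artifact, devs in developers_by_artifact(dev_artifact).items():
        for d1, d2 in combinations(sorted(devs), 2):
            found.add((d1, d2, artifact))
    return found


def find_squares(dev_artifact, artifact_artifact):
    """Return (d1, a1, d2, a2) for developers working on coupled artifacts."""
    grouped = developers_by_artifact(dev_artifact)
    found = set()
    for pair in artifact_artifact:
        a1, a2 = sorted(pair)
        for d1 in grouped.get(a1, ()):
            for d2 in grouped.get(a2, ()):
                if d1 != d2:
                    found.add((d1, a1, d2, a2))
    return found


def is_fulfilled(d1, d2, dev_dev):
    """A requirement is fulfilled if both developers share a thread."""
    return frozenset((d1, d2)) in dev_dev


# Toy network (illustrative only)
dev_artifact = {("alice", "a.c"), ("bob", "a.c"), ("carol", "b.c")}
artifact_artifact = {frozenset(("a.c", "b.c"))}
dev_dev = {frozenset(("alice", "bob"))}

triangles = find_triangles(dev_artifact)
squares = find_squares(dev_artifact, artifact_artifact)
fulfilled_triangles = {t for t in triangles if is_fulfilled(t[0], t[1], dev_dev)}
fulfilled_squares = {s for s in squares if is_fulfilled(s[0], s[2], dev_dev)}
```

In this toy network, alice and bob form a fulfilled triangle on a.c, whereas the two squares that involve carol remain unfulfilled because carol shares no thread with the others.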
Abstraction Levels for Source-Code Artifacts
In previous work [14, 13], researchers have tracked concurrent contributions of developers on the same file to derive coordination requirements. We conjecture that this view may be too technical to capture the richness of coordination. Thus, we introduce the two code-artifact abstractions function and feature to infer coordination requirements at different levels of abstraction. Although these abstraction levels are based on heuristics, they have been shown to be reliable in multiple previous studies [51, 55, 43, 44]. Additionally, with regard to network construction, the abstraction file has been shown to produce dense networks that are known to hinder community detection [10, 43] and, thus, represent less precise coordination relations among developers. It has already been shown that a function-level view is more accurate [44].
For illustration, we show how strongly the choice of abstraction level influences the extraction of coordination requirements by means of the triangle motif and its manifestation in the source-code excerpt listed in Figure 3. We show the resulting coordination-requirement networks in Figure 4.
Statistics and Formulas
To analyze the alignment of the e-mail-based developer coordination and the actual artifact-based collaboration, we measure the fraction of fulfilled coordination requirements. Given a coordination-requirement network constructed using one type of code artifact a (i.e., file, function, or feature) and a motif m to identify coordination requirements, we define the fraction fraccr(a, m) of fulfilled coordination requirements as follows:
fraccr(a, m) = |crfull(a, m)| / |crfound(a, m)|, where
crfound(a, m) = { c | c is a matched instance of motif m for artifact a in the current network } and
crfull(a, m) = { cf | cf ∈ crfound(a, m), cf is fulfilled }.
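A small worked example of this definition follows. The function name `fraction_fulfilled` is hypothetical; the counts in the last line are the values reported for Apache HTTP at the file level with the triangle motif in the results overview.

```python
def fraction_fulfilled(cr_found, cr_full):
    """Compute |cr_full| / |cr_found| for sets of matched motif instances."""
    if not cr_full <= cr_found:
        raise ValueError("every fulfilled requirement must also be a found one")
    if not cr_found:
        return None  # the fraction is undefined without any requirement
    return len(cr_full) / len(cr_found)


# Toy sets of matched motif instances (illustrative only)
cr_found = {("alice", "bob", "a.c"), ("alice", "carol", "a.c"),
            ("bob", "carol", "a.c"), ("bob", "dave", "b.c")}
cr_full = {("alice", "bob", "a.c")}
toy_fraction = fraction_fulfilled(cr_found, cr_full)

# With the reported counts for Apache HTTP (file, triangle motif):
apache_file_triangle = round(2818 / 5710, 2)
```

Returning `None` for an empty set of found requirements is a design choice of this sketch; it keeps revision ranges without any matched motif from being counted as 0.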
Tools
For data extraction, we mainly use the tool Codeface. Based on the Codeface results, we construct and analyze coordination-requirement networks using our network-construction library coronet and a set of self-written R scripts. Our script setup is available in the Downloads section.
Codeface
Codeface is a framework and interactive web frontend for the social and technical analysis of software development projects.
coronet
coronet is a library to construct socio-technical developer networks based on various data sources in a configurable and reproducible way.
R scripts
We developed a set of R scripts on top of Codeface and coronet for our analysis, which are available in the Downloads section.
Subject Systems
Project | Time | # Commits | # E-Mails | # Developers |
---|---|---|---|---|
Apache HTTP | 1996–2017 | 29704 | 54921 | 2146 |
BusyBox | 1999–2016 | 14313 | 42013 | 2736 |
FFmpeg | 2000–2017 | 80605 | 242295 | 5998 |
Git | 2005–2017 | 34898 | 338500 | 9246 |
LLVM | 2001–2017 | 158562 | 706716 | 6407 |
OpenSSL | 1998–2016 | 18143 | 32659 | 4786 |
PostgreSQL | 1996–2017 | 44062 | 320711 | 4647 |
QEMU | 2003–2016 | 46633 | 430561 | 7205 |
U-Boot | 1988–2017 | 44736 | 319160 | 7924 |
Wine | 1993–2017 | 121815 | 111333 | 4087 |
Results
Overview on Subject Systems
Project | Artifact | # Artifacts | m▵ crfound | m▵ crfull | m▵ fraccr | m□ crfound | m□ crfull | m□ fraccr |
---|---|---|---|---|---|---|---|---|
Apache HTTP | file | 1366 | 5710 | 2818 | 0.49 | 316417 | 134993 | 0.43 |
Apache HTTP | function | 16869 | 2830 | 1612 | 0.57 | 173364 | 121281 | 0.70 |
Apache HTTP | feature | 1357 | 654 | 322 | 0.49 | 7224 | 4380 | 0.61 |
BusyBox | file | 1370 | 2021 | 907 | 0.45 | 181236 | 69771 | 0.38 |
BusyBox | function | 10942 | 1661 | 756 | 0.46 | 164853 | 29122 | 0.18 |
BusyBox | feature | 2534 | 670 | 312 | 0.47 | 18384 | 3407 | 0.19 |
FFmpeg | file | 3257 | 42978 | 18277 | 0.43 | 4030726 | 1886168 | 0.47 |
FFmpeg | function | 33078 | 19303 | 8658 | 0.45 | 1154399 | 438927 | 0.38 |
FFmpeg | feature | 2079 | 5742 | 2369 | 0.41 | 110 | 51972 | 0.47 |
Git | file | 1740 | 13690 | 4798 | 0.35 | 240103 | 77443 | 0.32 |
Git | function | 11937 | 7151 | 2473 | 0.35 | 128572 | 39928 | 0.31 |
Git | feature | 175 | 185 | 119 | 0.64 | 209 | 177 | 0.85 |
LLVM | file | 5619 | 119428 | 42219 | 0.35 | 9974743 | 3832917 | 0.38 |
LLVM | function | 50201 | 17647 | 8457 | 0.48 | 7912 | 571735 | 0.57 |
LLVM | feature | 937 | 8495 | 2432 | 0.29 | 26508 | 9420 | 0.36 |
OpenSSL | file | 1444 | 5295 | 1922 | 0.36 | 712756 | 307113 | 0.43 |
OpenSSL | function | 12941 | 3044 | 1195 | 0.39 | 183091 | 88851 | 0.49 |
OpenSSL | feature | 1132 | 3445 | 1134 | 0.33 | 153236 | 68464 | 0.45 |
PostgreSQL | file | 2192 | 21798 | 16671 | 0.76 | 5147123 | 4210 | 0.82 |
PostgreSQL | function | 34960 | 14237 | 10797 | 0.76 | 3193322 | 2551078 | 0.80 |
PostgreSQL | feature | 1061 | 1764 | 1339 | 0.76 | 19753 | 13158 | 0.67 |
QEMU | file | 3227 | 36467 | 23096 | 0.63 | 3466452 | 2486040 | 0.72 |
QEMU | function | 57955 | 15394 | 10365 | 0.67 | 1166730 | 810775 | 0.69 |
QEMU | feature | 1753 | 13892 | 5433 | 0.39 | 156289 | 59658 | 0.38 |
U-Boot | file | 8257 | 11096 | 5755 | 0.52 | 307816 | 185906 | 0.60 |
U-Boot | function | 63067 | 4664 | 2680 | 0.57 | 162257 | 415 | 0.62 |
U-Boot | feature | 7065 | 20711 | 6931 | 0.33 | 423328 | 147434 | 0.35 |
Wine | file | 5568 | 45088 | 19 | 0.44 | 1463843 | 817654 | 0.56 |
Wine | function | 164073 | 23665 | 12506 | 0.53 | 1431211 | 818966 | 0.57 |
Wine | feature | 1687 | 6348 | 2328 | 0.37 | 32501 | 12884 | 0.40 |
Project | # Commits | Commits per developer (Avg. ± Std. dev.) | Commits per developer (Median) | Commits per developer (.8 quantile) | Commits per developer (Max.) |
---|---|---|---|---|---|
Apache HTTP | 29671 | 237.37 ± 417.48 | 72.00 | 360.00 | 2452 |
BusyBox | 14259 | 53.01 ± 433.51 | 1.00 | 5.00 | 6495 |
FFmpeg | 80535 | 52.84 ± 560.22 | 2.00 | 9.00 | 19516 |
Git | 34872 | 23.50 ± 159.14 | 2.00 | 10.00 | 3989 |
LLVM | 158519 | 183.47 ± 1023.25 | 17.00 | 104.00 | 26580 |
OpenSSL | 18077 | 62.33 ± 388.51 | 1.00 | 5.00 | 4535 |
PostgreSQL | 44010 | 1047.86 ± 2821.8 | 166.50 | 778.60 | 13327 |
QEMU | 46578 | 43.82 ± 188.6 | 3.00 | 18.00 | 2505 |
U-Boot | 44680 | 27.48 ± 146.9 | 3.00 | 18.00 | 4154 |
Wine | 121731 | 79.82 ± 532.83 | 2.00 | 17.20 | 14089 |
Project | Artifact | Commits per artifact (Avg. ± Std. dev.) | Commits per artifact (Median) | Commits per artifact (.8 quantile) | Commits per artifact (Max.) |
---|---|---|---|---|---|
Apache HTTP | file | 16.37 ± 43.77 | 1.00 | 17.00 | 469 |
Apache HTTP | function | 3.08 ± 40.53 | 1.00 | 3.00 | 5227 |
Apache HTTP | feature | 8.18 ± 321.1 | 1.00 | 3.00 | 21200 |
BusyBox | file | 20.81 ± 39.83 | 8.00 | 29.00 | 645 |
BusyBox | function | 4.96 ± 54.18 | 2.00 | 6.00 | 5620 |
BusyBox | feature | 6.58 ± 201.26 | 2.00 | 5.00 | 17700 |
FFmpeg | file | 26.81 ± 69.74 | 9.00 | 33.00 | 1873 |
FFmpeg | function | 4.82 ± 114.61 | 2.00 | 5.00 | 20796 |
FFmpeg | feature | 24.07 ± 1323.09 | 2.00 | 5.00 | 97296 |
Git | file | 18.24 ± 35.96 | 7.00 | 24.00 | 518 |
Git | function | 4.59 ± 92.82 | 2.00 | 5.00 | 10127 |
Git | feature | 85.7 ± 1796.61 | 1.00 | 3.00 | 39118 |
LLVM | file | 31.2 ± 98.95 | 6.00 | 32.00 | 4182 |
LLVM | function | 4.55 ± 323.23 | 2.00 | 4.00 | 72385 |
LLVM | feature | 57.66 ± 3108.01 | 1.00 | 3.00 | 181534 |
OpenSSL | file | 20.29 ± 32.48 | 11.00 | 28.00 | 392 |
OpenSSL | function | 4.25 ± 40.11 | 2.00 | 5.00 | 4521 |
OpenSSL | feature | 8.24 ± 188.34 | 2.00 | 4.00 | 13572 |
PostgreSQL | file | 38.37 ± 73.76 | 13.00 | 48.00 | 862 |
PostgreSQL | function | 5.25 ± 76.96 | 2.00 | 6.00 | 14305 |
PostgreSQL | feature | 18.61 ± 740.11 | 2.00 | 5.00 | 43704 |
QEMU | file | 22.26 ± 55.97 | 7.00 | 27.00 | 1401 |
QEMU | function | 3.2 ± 79.15 | 2.00 | 4.00 | 19012 |
QEMU | feature | 16.82 ± 843.12 | 1.00 | 4.00 | 66572 |
U-Boot | file | 6.91 ± 11.13 | 4.00 | 9.00 | 238 |
U-Boot | function | 2.11 ± 62.79 | 1.00 | 2.00 | 15762 |
U-Boot | feature | 5.12 ± 257.7 | 1.00 | 3.00 | 41286 |
Wine | file | 33.3 ± 68.9 | 12.00 | 45.00 | 1824 |
Wine | function | 3.65 ± 111.78 | 2.00 | 4.00 | 45243 |
Wine | feature | 41.16 ± 2579.56 | 1.00 | 4.00 | 192364 |
Hypothesis RH1
See the folders stats/hypo1-collect/ (raw data, empirical and null model) and stats/hypo1/empirical/ (statistical tests).
Both input and output data are available, along with all plots presented in this section.
Paired Wilcoxon signed-rank test | ||
---|---|---|
H0: | fraccr(⏺, m▵) ≤ fraccrnull(⏺, m▵) | fraccr(⏺, m□) ≤ fraccrnull(⏺, m□) |
fraccr(file, ⏺) | W ≈ 54, p < 0.01*, δ = 0.42 | W ≈ 48, p ≈ 0.02*, δ = 0.12 |
fraccr(function, ⏺) | W ≈ 53, p < 0.01*, δ = 0.48 | W ≈ 53, p < 0.01*, δ = 0.24 |
fraccr(feature, ⏺) | W ≈ 54, p < 0.01*, δ = 0.3 | W ≈ 51, p < 0.01*, δ = 0.26 |
W = test value W, p = p-value, * for p < 0.05, δ = Cliff's δ effect size |
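To make the reported quantities more concrete, the following sketch computes a signed-rank test value and Cliff's δ by hand. It assumes that W is the sum of the ranks of the positive paired differences (the convention used, for example, by R's wilcox.test with paired = TRUE) and that δ is computed over all cross pairs of the two samples; both conventions are assumptions of this example, and the input values are made up for illustration. The p-value is not computed here.

```python
def signed_rank_statistic(x, y):
    """Sum of ranks of the positive differences x_i - y_i.

    Zero differences are dropped; tied absolute differences
    receive the average of the ranks they span.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of equal absolute differences.
        while (j + 1 < len(order)
               and abs(diffs[order[j + 1]]) == abs(diffs[order[i]])):
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return sum(rank for rank, d in zip(ranks, diffs) if d > 0)


def cliffs_delta(x, y):
    """(#pairs with x > y minus #pairs with x < y) / (|x| * |y|)."""
    greater = sum(1 for a in x for b in y if a > b)
    less = sum(1 for a in x for b in y if a < b)
    return (greater - less) / (len(x) * len(y))


# Made-up fractions for four projects (illustrative only)
empirical = [0.49, 0.45, 0.43, 0.35]
null_model = [0.40, 0.41, 0.44, 0.30]

w = signed_rank_statistic(empirical, null_model)
delta = cliffs_delta(empirical, null_model)
```

With n non-zero pairs, this test value can be at most n(n + 1)/2, which is 55 for the ten subject systems.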
Sensitivity Analysis following Kossinets (2006)
We performed a sensitivity analysis following Kossinets (2006) to investigate the stability of our results. In detail, we used the simulation algorithm "BSPC" (boundary specification problem for contexts) to simulate coordination effort that is absent from the mailing list (because it may take place on other channels such as face-to-face meetings or chats) and, thus, incomplete information sources (i.e., mailing-list data), similar to the null models (see Section 3.2.4). The algorithm removes a defined number of random e-mail threads before constructing the coordination-requirement networks and then calculates the metrics as previously defined. To this end, for the projects BusyBox, Git, LLVM, and OpenSSL, we randomly removed 10, 20, …, 90 percent of the e-mail threads, performed 25 iterations for better randomization, calculated mean values, and analyzed the final results. In short, we found for the selected projects that removing 10% of all e-mail threads produces a relative error of about 15% for fraccr, across all revision ranges and for all motifs and source-code artifacts. With 20% of all e-mail threads randomly removed, the metric exhibits a relative error of about 25%, on average. The corresponding plots for both motifs are listed in the Downloads section (plots for sensitivity analysis). These results indicate that the absence of crucial developers may have an immediate and extensive effect on most projects and emphasize that any coordination effort is important to fulfill coordination requirements.
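The thread-removal procedure can be sketched as follows. This is a simplified illustration and not the authors' R implementation: coordination requirements are reduced to developer pairs, the relative error is assumed to be |baseline − mean| / baseline, and all names and data are hypothetical.

```python
import random
from itertools import combinations


def fulfilled_fraction(requirements, threads):
    """Fraction of required developer pairs that share at least one thread."""
    communicating = set()
    for participants in threads:
        for d1, d2 in combinations(sorted(set(participants)), 2):
            communicating.add(frozenset((d1, d2)))
    if not requirements:
        return 0.0
    return sum(1 for r in requirements if r in communicating) / len(requirements)


def relative_error_after_removal(requirements, threads, share,
                                 iterations=25, seed=0):
    """Remove a share of random threads and compare against the baseline."""
    rng = random.Random(seed)
    baseline = fulfilled_fraction(requirements, threads)
    if baseline == 0:
        raise ValueError("relative error is undefined for a zero baseline")
    keep = len(threads) - round(share * len(threads))
    values = [fulfilled_fraction(requirements, rng.sample(threads, keep))
              for _ in range(iterations)]
    mean = sum(values) / len(values)
    return abs(baseline - mean) / baseline


# Toy data (illustrative only)
requirements = {frozenset(("alice", "bob")), frozenset(("bob", "carol"))}
threads = [["alice", "bob"], ["bob", "carol"],
           ["alice", "bob", "dave"], ["dave", "erin"]]

errors = {share: relative_error_after_removal(requirements, threads, share)
          for share in (0.0, 0.5, 1.0)}
```

Because removing threads leaves the set of found requirements untouched and can only remove communication edges, the mean fraction never exceeds the baseline in this sketch.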
Hypothesis RH2
See the folder stats/hypo2/empirical/.
Both input and output data are available, along with all plots presented in this section.
Paired Wilcoxon signed-rank test | |
---|---|
H0: | fraccr(a, m▵) ≤ fraccrnull(a, m▵) |
fraccr(a, m▵) | |
fraccr(a, m□) | N = 30, W = 325, p ≈ 0.9721167 |
N = number of pairs, W = test value W, p = p-value |
Hypothesis RH3
See the folder stats/hypo3/empirical/.
Both input and output data are available, along with all plots presented in this section.
In particular, all data regarding unique coordination requirements per artifact abstraction level are available for all subject systems in this folder as well (file hypo3-setdiffs.txt).
Paired Wilcoxon signed-rank test | ||
---|---|---|
H0: | fraccr(file, m) ≥ fraccr(⏺, m) | fraccr(function, m) ≥ fraccr(⏺, m) |
fraccr(function, ⏺) | N = 10, W = 152, p ≈ 0.12 | |
fraccr(feature, ⏺) | N = 10, W = 64, p ≈ 0.98 | N = 10, W = 52, p ≈ 0.98 |
N = number of pairs, W = test value W, p = p-value |
not identified by … | File: crfound | File: crfull | Function: crfound | Function: crfull | Feature: crfound | Feature: crfull |
---|---|---|---|---|---|---|
File | — | — | 0 | 0 | 91 | 30 |
Function | 1074 | 438 | — | — | 183 | 72 |
Feature | 1718 | 753 | 736 | 357 | — | — |
Combined | 982 | 396 | 0 | 0 | 91 | 30 |
Hypothesis RH4
See the folder stats/hypo4/empirical/.
The history plots for all subject systems and motifs are available separately in the Downloads section.
After extracting the downloaded file, see the folder history-plots/.
Project | Dm▵ | Dm□ |
---|---|---|
QEMU | 1.39 | 1.43 |
U-Boot | 1.40 | 1.46 |
FFmpeg | 1.49 | 1.55 |
LLVM | 1.51 | 1.54 |
PostgreSQL | 1.51 | 1.54 |
Wine | 1.57 | 1.64 |
BusyBox | 1.59 | 1.60 |
Git | 1.59 | 1.58 |
Apache HTTP | 1.65 | 1.71 |
OpenSSL | 1.67 | 1.69 |
Downloads
For reasons of data privacy and data size, we cannot distribute the raw data that we gathered for our subject systems but only processed data and results. Please refer to the tools Codeface and coronet to produce a set of data for yourself. You can find more information on the selected set of subject systems, the analyzed time ranges, and all needed further information in our subject-system list available above.
Downloadable assets:
- Subject-system details: subject-systems-details.ods
- Codeface configuration files: network-coordination-requirements_configurations.zip
- Analysis scripts: network-coordination-requirements_scripts.zip
- Statistical results: network-coordination-requirements_stats.zip
- History plots (Hypothesis RH4): network-coordination-requirements_plots.zip
- Plots for sensitivity analysis (Hypothesis RH1): network-coordination-requirements_plots-sensitivity.zip
Analysis Scripts
To reproduce the data for an individual subject project, the data needs to be processed by Codeface and codeface-extraction first – please use the configuration files provided above.
Afterwards, the output data can be processed using our analysis scripts.
The main script of our analysis is analysis.R; it needs to be used in each and every stage of the analysis.
Overall, there are five different stages in our analysis, which need to be run consecutively and which (mostly) run for one subject system at a time (for each, data output is cached appropriately):
- data: run the empirical analysis by constructing coordination-requirement networks and searching for motifs
- null: construct null-model networks and search for motifs
- sensitivity: construct sensitivity model following Kossinets (2006) and search for motifs
- stats: run all statistical tests and construct corresponding plots (independent of configured subject system)
- history-plots: construct the history plots for Hypothesis RH4
Based on the configuration files provided by us, the following parameters should be given to each analysis stage:
- script: a stage as described above
- casestudy: subject-system name as in the Codeface configuration
- artifact: the artifact abstraction level to use
- artifact_relation: must be set to "cochange" to reproduce the study
- -d CODEFACE_DATA: set to the path where the Codeface output is placed
- -w "3 months": use three-month revision ranges
- -s "threemonth": an abstraction level in Codeface to structure analyses
- --null-model="rewire": use the null model based on rewiring
- -no-remove-core: do not remove core developers in null-model analysis
As a consequence, an exemplary call to gather feature-based empirical data for Apache HTTP looks like this:
Rscript analysis.R data apache-http feature cochange -d "/path/to/codeface-data/" -s "threemonth" -w "3 months" --loglevel "DEBUG"
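When the same stage has to be run for several artifact abstraction levels, the call above can be generated programmatically. The following sketch only assembles the argument lists and prints them; the helper name `build_command` is hypothetical, the arguments mirror the exemplary call, and the optional --loglevel argument is left out.

```python
import shlex


def build_command(stage, casestudy, artifact, codeface_data):
    """Assemble the argument list for one run of analysis.R."""
    return [
        "Rscript", "analysis.R",
        stage,          # analysis stage, e.g., "data"
        casestudy,      # subject-system name as in the Codeface configuration
        artifact,       # "file", "function", or "feature"
        "cochange",     # artifact relation required to reproduce the study
        "-d", codeface_data,
        "-s", "threemonth",
        "-w", "3 months",
    ]


commands = [
    build_command("data", "apache-http", artifact, "/path/to/codeface-data/")
    for artifact in ("file", "function", "feature")
]

for command in commands:
    # shlex.join quotes arguments that contain spaces, such as "3 months".
    print(shlex.join(command))
```

Each argument list could be passed to `subprocess.run(command, check=True)` so that a failing run stops the sequence before the next one starts.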
Contact
If you have any questions regarding this paper or any other related project, please do not hesitate to contact us:
- Claus Hunsen (University of Passau, Passau, Germany)
- Janet Siegmund (Chemnitz University of Technology, Chemnitz, Germany)
- Sven Apel (Saarland University, Saarbrücken, Germany)