On the Fulfillment of Coordination Requirements in Open-Source Software Projects:
An Exploratory Study

— Empirical Software Engineering: Supplementary Website

Claus Hunsen, Janet Siegmund, Sven Apel

Abstract

Coordination among developers is crucial in large-scale open-source software projects, where developers are often distributed across the entire planet. By assessing the alignment of collaboration and communication in such software projects in terms of coordination requirements, we can estimate whether a state of socio-technical congruence is achieved, which is associated with software quality and project success. By means of an empirical study on a substantial set of large-scale open-source software projects (including OpenSSL, Git, and LLVM), altogether making up over 180 years of development history, we aim at shedding light on this issue. Compared to the state of the art in this research area, we identify not only coordination requirements arising from files and functions, but also those arising from features. This way, we take a more semantic view on this phenomenon. We found that open-source developers fulfill coordination requirements on purpose, but mostly those coordination requirements arising from coupled source-code artifacts, while they resolve simpler ones independently. Furthermore, we found that none of the considered abstraction levels of source-code artifacts (files, functions, features) is more suitable than the others as a constructional argument for coordination requirements with respect to their fulfillment. This finding strongly indicates that features do not play as important a role in the development process as expected and commonly believed by the research community in the area of feature-oriented and feature-driven development. Finally, we identified notable evolutionary trends in the fulfillment of coordination requirements and showed that far-reaching social events have a huge impact on their fulfillment, both negatively and positively.
The key findings of our empirical study are that socio-technical relations are important to understand open-source development communities and that the incorporation of different abstraction levels for developer collaboration does yield important insights to further improve the evolution in open-source software projects.

Keywords: coordination requirements, socio-technical congruence, social-network analysis, coronet, Codeface, open-source software systems, configurable systems, software product lines, feature-oriented software development

Research Questions and Hypotheses

Research Questions

RQ1
Does developer communication align with artifact-based coordination requirements in real-world OSS projects such that the coordination requirements are fulfilled?
RQ2
Does developer communication align better with feature-based coordination requirements than with function-based or file-based coordination requirements?
RQ3
Does the degree of fulfillment of coordination requirements change for different artifact types during project evolution?

Hypotheses

RH1
A high number of developer pairs collaborating on the same artifacts do exchange e-mails on the same threads of the mailing list, such that coordination requirements arising from any type of artifact are fulfilled not only by chance.
RH2
The fraction of fulfilled coordination requirements is lower for the square motif than for the triangle motif, independent of the observed artifact abstraction.
RH3
The fraction of fulfilled coordination requirements differs for the different artifacts, and is significantly highest for the abstraction level of features.
RH4
In later stages of development, the fraction of fulfilled coordination requirements is higher than in earlier stages for all motifs and artifacts.

Network Approach

Coordination-Requirement Networks and Motifs

To analyze the fulfillment of coordination requirements in a software project, we construct coordination-requirement networks, which we can analyze with network-analytic methods. We show an exemplary coordination-requirement network in Figure 1. Formally, such a network is defined as an undirected graph G = (D ∪ A, E), where we encode developers (set D) and artifacts (set A) as vertices; E is the set of edges among the vertices. The edges encode three relations: coordination among developers, coupling among artifacts, and developers' work on artifacts (see Figure 1).

...
Figure 1. An exemplary coordination-requirement network. Circles represent developers; dashed edges among developers represent coordination effort. Squares represent artifacts; dashed edges among artifacts represent coupling among the connected artifacts. Developers are connected to artifacts by solid edges if they worked on that artifact in a commit.

To automatically identify coordination requirements in coordination-requirement networks, we encode coordination requirements or, rather, the patterns they represent as network motifs. Network motifs are recurrent sub-graphs in a given network [75]. Motifs can be described formally as a set of vertices (e.g., {d1, d2, a1}) with specific edges connecting them. We show two network motifs for coordination requirements in Figure 2, the triangle motif and the square motif.

Figure 2. Triangle and square motifs. Edges among artifacts represent coupling, while a developer is connected to an artifact if they worked on that artifact. An edge between two developers represents coordination, where the edge’s existence indicates the fulfillment of the encoded coordination requirement.
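
The two motifs can be illustrated with a small sketch. It is written in Python for brevity (the authors' tooling is R-based) and is not the authors' implementation. The names `artifact_devs`, `coupling`, and `communication` are hypothetical, and the square motif is matched in its simplest reading (two distinct developers working on two coupled artifacts); the paper may impose additional conditions.

```python
from itertools import combinations


def triangle_motifs(artifact_devs):
    """All (d1, d2, artifact) where both developers worked on the same artifact."""
    found = set()
    for artifact, devs in artifact_devs.items():
        for d1, d2 in combinations(sorted(devs), 2):
            found.add((d1, d2, artifact))
    return found


def square_motifs(artifact_devs, coupling):
    """All ((d1, a1), (d2, a2)) where a1 and a2 are coupled and d1 != d2."""
    found = set()
    for a1, a2 in coupling:
        for d1 in artifact_devs.get(a1, ()):
            for d2 in artifact_devs.get(a2, ()):
                if d1 != d2:
                    # Sort the two halves so each instance is stored only once.
                    found.add(tuple(sorted([(d1, a1), (d2, a2)])))
    return found


def is_fulfilled(d1, d2, communication):
    """A requirement is fulfilled if a developer-developer edge exists."""
    return frozenset((d1, d2)) in communication


# Hypothetical toy network (not taken from the study).
artifact_devs = {"a1": {"d1", "d2"}, "a2": {"d3"}}
coupling = {("a1", "a2")}
communication = {frozenset(("d1", "d2")), frozenset(("d1", "d3"))}

triangles = triangle_motifs(artifact_devs)
squares = square_motifs(artifact_devs, coupling)
fulfilled_triangles = {t for t in triangles if is_fulfilled(t[0], t[1], communication)}
fulfilled_squares = {s for s in squares if is_fulfilled(s[0][0], s[1][0], communication)}
```

In this toy network, the single triangle instance is fulfilled, whereas only one of the two square instances is backed by a communication edge.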

Abstraction Levels for Source-Code Artifacts

In previous work [14, 13], researchers have tracked concurrent contributions of developers on the same file to derive coordination requirements. We conjecture that this view may be too technical to capture the richness of coordination. Thus, we introduce the two code-artifact abstractions function and feature to infer coordination requirements at different levels of abstraction. Although these abstraction levels are based on heuristics, they have been shown to be reliable in multiple previous studies [51, 55, 43, 44]. Additionally, with regard to network construction, the abstraction file has been shown to produce dense networks that are known to hinder community detection [10, 43] and, thus, to represent coordination relations among developers less precisely. It has already been shown that a function-level view is more accurate [44].

For illustration, we show how strongly the choice of abstraction level influences the extraction of coordination requirements by means of the triangle motif and its manifestation. The code example is shown in Figure 3, and the resulting coordination-requirement networks are shown in Figure 4.

diff --git a/actions.c b/actions.c
index d4ea8ff..ecb9f59 100644
--- a/actions.c
+++ b/actions.c
@@ -1,8 +1,11 @@ Changes by Dev C
void delete (struct DBconnection *DBconn,
          char *command) {
 //...
+ #ifdef PERSIST
+    persist();
+ #endif
}

 #ifdef FEATURE_LOCKING
void lockOnAction(struct DBaction *action) {

@@ -11,12 +11,12 @@ Changes by Dev C
-    // old code
+    // Dev C makes changes here

@@ -13,21 +13,21 @@ Changes by Dev B
-    if (data == NULL) {
+    if (data != NULL) {
      // ...
 }

 // more code

}
 #endif
diff --git a/db.c b/db.c
index 130f79c..d1c26b1 100644
--- a/db.c
+++ b/db.c
@@ -1,6 +1,6 @@ Changes by Dev A
 #ifdef PERSIST
-void persist() {
-    // old code
+void persist(char *filename) {
+    // completely rewritten code
}
 #endif

@@ -7,10 +7,10 @@ Changes by Dev B
void execute(struct DBConn *conn,
          struct DBAction *action) {
-    // old code
+    // code with bugfixes
}
Figure 3. Code example containing two files, four functions, and two features (controlled by #ifdef directives). Lines starting with either + or - indicate patch blocks applied by individual developers.
Figure 4. Coordination-requirement networks (excluding coordination edges) extracted from the source code in Figure 3 for each of the abstraction levels file, function, and feature.
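
To make the effect of the abstraction level concrete, the following sketch derives triangle-motif candidates from the code example. The `CHANGES` list is our own reading of the patch blocks in Figure 3 (which developer touched which file, function, and feature), not data shipped with the study. Code outside any #ifdef block is treated here as belonging to no feature, which is an assumption.

```python
from itertools import combinations

# Assumed reading of Figure 3: (developer, file, function, feature or None).
CHANGES = [
    ("C", "actions.c", "delete", "PERSIST"),
    ("C", "actions.c", "lockOnAction", "FEATURE_LOCKING"),
    ("B", "actions.c", "lockOnAction", "FEATURE_LOCKING"),
    ("A", "db.c", "persist", "PERSIST"),
    ("B", "db.c", "execute", None),
]

LEVEL_INDEX = {"file": 1, "function": 2, "feature": 3}


def developers_per_artifact(changes, level):
    """Map each artifact of the given abstraction level to its developers."""
    index = LEVEL_INDEX[level]
    result = {}
    for change in changes:
        artifact = change[index]
        if artifact is not None:
            result.setdefault(artifact, set()).add(change[0])
    return result


def developer_pairs(changes, level):
    """Triangle-motif candidates: developer pairs sharing one artifact."""
    pairs = set()
    for artifact, devs in developers_per_artifact(changes, level).items():
        for d1, d2 in combinations(sorted(devs), 2):
            pairs.add((d1, d2, artifact))
    return pairs


for level in ("file", "function", "feature"):
    print(level, sorted(developer_pairs(CHANGES, level)))
```

Under this reading, the file level pairs B with C and A with B, the function level pairs only B with C, and the feature level pairs A with C and B with C. The same history therefore yields different coordination requirements depending on the abstraction.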

Statistics and Formulas

To analyze the alignment of the e-mail-based developer coordination and the actual artifact-based collaboration, we measure the fraction of fulfilled coordination requirements. Given a coordination-requirement network constructed using one type of code artifact a (i.e., file, function, or feature) and a motif m to identify coordination requirements, we define the fraction fraccr(a, m) of fulfilled coordination requirements as follows:

fraccr(a, m) = |crfull(a, m)| / |crfound(a, m)|, where

crfound(a, m) = { c | matched instance c of motif m for artifact a in the current network } and
crfull(a, m) = { cf | cf ∈ crfound(a, m), cf is fulfilled }.
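
As a quick check of the definition, the following sketch computes the fraction from the two set sizes. The function name `frac_cr` and the handling of an empty crfound set (returning NaN) are our assumptions; the example counts are the Apache HTTP triangle-motif values from Table 2.2.

```python
import math


def frac_cr(n_found, n_full):
    """Fraction of fulfilled coordination requirements, |cr_full| / |cr_found|."""
    if n_full > n_found:
        raise ValueError("cr_full is a subset of cr_found, so it cannot be larger")
    if n_found == 0:
        return math.nan  # assumption: undefined when no motif instance was matched
    return n_full / n_found


# Apache HTTP, triangle motif (values from Table 2.2).
apache_file = frac_cr(5710, 2818)
apache_function = frac_cr(2830, 1612)
print(round(apache_file, 2), round(apache_function, 2))
```

Rounded to two digits, both values agree with the fraccr column of Table 2.2 for these rows.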

Tools

For data extraction, we mainly use the tool Codeface. Based on the Codeface results, we construct and analyze coordination-requirement networks using our network-construction library coronet and a set of self-written R scripts. Our script setup is available in the Downloads section.

Codeface

Codeface is a framework and interactive web frontend for the social and technical analysis of software development projects.

coronet

coronet is a library to construct socio-technical developer networks based on various data sources in a configurable and reproducible way.

R scripts

We developed a set of R scripts on top of Codeface and coronet for our analysis, which are available in the Downloads section.

Subject Systems

Table 2.1. List of subject projects.
Project Time # Commits # E-Mails # Developers
Apache HTTP 1996–2017 29704 54921 2146
BusyBox 1999–2016 14313 42013 2736
FFmpeg 2000–2017 80605 242295 5998
Git 2005–2017 34898 338500 9246
LLVM 2001–2017 158562 706716 6407
OpenSSL 1998–2016 18143 32659 4786
PostgreSQL 1996–2017 44062 320711 4647
QEMU 2003–2016 46633 430561 7205
U-Boot 1988–2017 44736 319160 7924
Wine 1993–2017 121815 111333 4087

Results

Overview on Subject Systems

Table 2.2. Empirical data on fulfilled coordination requirements for all subject projects.
                                     Triangle motif m           Square motif m
Project      Artifact  # Artifacts   crfound  crfull  fraccr    crfound   crfull  fraccr
Apache HTTP  file             1366      5710    2818    0.49     316417   134993    0.43
             function        16869      2830    1612    0.57     173364   121281    0.70
             feature          1357       654     322    0.49       7224     4380    0.61
BusyBox      file             1370      2021     907    0.45     181236    69771    0.38
             function        10942      1661     756    0.46     164853    29122    0.18
             feature          2534       670     312    0.47      18384     3407    0.19
FFmpeg       file             3257     42978   18277    0.43    4030726  1886168    0.47
             function        33078     19303    8658    0.45    1154399   438927    0.38
             feature          2079      5742    2369    0.41        110    51972    0.47
Git          file             1740     13690    4798    0.35     240103    77443    0.32
             function        11937      7151    2473    0.35     128572    39928    0.31
             feature           175       185     119    0.64        209      177    0.85
LLVM         file             5619    119428   42219    0.35    9974743  3832917    0.38
             function        50201     17647    8457    0.48       7912   571735    0.57
             feature           937      8495    2432    0.29      26508     9420    0.36
OpenSSL      file             1444      5295    1922    0.36     712756   307113    0.43
             function        12941      3044    1195    0.39     183091    88851    0.49
             feature          1132      3445    1134    0.33     153236    68464    0.45
PostgreSQL   file             2192     21798   16671    0.76    5147123     4210    0.82
             function        34960     14237   10797    0.76    3193322  2551078    0.80
             feature          1061      1764    1339    0.76      19753    13158    0.67
QEMU         file             3227     36467   23096    0.63    3466452  2486040    0.72
             function        57955     15394   10365    0.67    1166730   810775    0.69
             feature          1753     13892    5433    0.39     156289    59658    0.38
U-Boot       file             8257     11096    5755    0.52     307816   185906    0.60
             function        63067      4664    2680    0.57     162257      415    0.62
             feature          7065     20711    6931    0.33     423328   147434    0.35
Wine         file             5568     45088      19    0.44    1463843   817654    0.56
             function       164073     23665   12506    0.53    1431211   818966    0.57
             feature          1687      6348    2328    0.37      32501    12884    0.40

Table 2.3. Commits per developer in subject projects (mean, standard deviation, median, .8 quantile, maximum).
                        # Commits per developer
Project      # Commits  Avg. ± Std. dev.   Median  .8 quantile   Max.
Apache HTTP      29671   237.37 ± 417.48    72.00       360.00   2452
BusyBox          14259    53.01 ± 433.51     1.00         5.00   6495
FFmpeg           80535    52.84 ± 560.22     2.00         9.00  19516
Git              34872    23.50 ± 159.14     2.00        10.00   3989
LLVM            158519   183.47 ± 1023.25   17.00       104.00  26580
OpenSSL          18077    62.33 ± 388.51     1.00         5.00   4535
PostgreSQL       44010  1047.86 ± 2821.8   166.50       778.60  13327
QEMU             46578    43.82 ± 188.6      3.00        18.00   2505
U-Boot           44680    27.48 ± 146.9      3.00        18.00   4154
Wine            121731    79.82 ± 532.83     2.00        17.20  14089

Table 2.4. Commits per artifact in subject projects (mean, standard deviation, median, .8 quantile, maximum).
                       # Commits per artifact
Project      Artifact  Avg. ± Std. dev.  Median  .8 quantile    Max.
Apache HTTP  file       16.37 ± 43.77      1.00        17.00     469
             function    3.08 ± 40.53      1.00         3.00    5227
             feature     8.18 ± 321.1      1.00         3.00   21200
BusyBox      file       20.81 ± 39.83      8.00        29.00     645
             function    4.96 ± 54.18      2.00         6.00    5620
             feature     6.58 ± 201.26     2.00         5.00   17700
FFmpeg       file       26.81 ± 69.74      9.00        33.00    1873
             function    4.82 ± 114.61     2.00         5.00   20796
             feature    24.07 ± 1323.09    2.00         5.00   97296
Git          file       18.24 ± 35.96      7.00        24.00     518
             function    4.59 ± 92.82      2.00         5.00   10127
             feature     85.7 ± 1796.61    1.00         3.00   39118
LLVM         file        31.2 ± 98.95      6.00        32.00    4182
             function    4.55 ± 323.23     2.00         4.00   72385
             feature    57.66 ± 3108.01    1.00         3.00  181534
OpenSSL      file       20.29 ± 32.48     11.00        28.00     392
             function    4.25 ± 40.11      2.00         5.00    4521
             feature     8.24 ± 188.34     2.00         4.00   13572
PostgreSQL   file       38.37 ± 73.76     13.00        48.00     862
             function    5.25 ± 76.96      2.00         6.00   14305
             feature    18.61 ± 740.11     2.00         5.00   43704
QEMU         file       22.26 ± 55.97      7.00        27.00    1401
             function     3.2 ± 79.15      2.00         4.00   19012
             feature    16.82 ± 843.12     1.00         4.00   66572
U-Boot       file        6.91 ± 11.13      4.00         9.00     238
             function    2.11 ± 62.79      1.00         2.00   15762
             feature     5.12 ± 257.7      1.00         3.00   41286
Wine         file        33.3 ± 68.9      12.00        45.00    1824
             function    3.65 ± 111.78     2.00         4.00   45243
             feature    41.16 ± 2579.56    1.00         4.00  192364

Hypothesis RH1

After downloading and extracting the statistical results, the raw results for this hypothesis can be found in the folders stats/hypo1-collect/ (raw data, empirical and null model) and stats/hypo1/empirical/ (statistical tests). Both input and output data are available, along with all plots presented in this section.
...
Figure 5. Fraction of fulfilled coordination requirements fraccr(a, m) per motif m and artifact a as violin plot
...
Figure 6. Fraction of fulfilled coordination requirements fraccr(a, m) per motif m and artifact a as violin plot
Table 3. Paired Wilcoxon signed-rank test regarding Hypothesis RH1, data paired by motif m and artifact a (H0: fraccr(a, m) ≤ fraccrnull(a, m), N = 10 for all tests).
Paired Wilcoxon signed-rank test
H0:                  fraccr(⏺, m) ≤ fraccrnull(⏺, m)   fraccr(⏺, m) ≤ fraccrnull(⏺, m)
fraccr(file, ⏺)      W ≈ 54, p < 0.01*, δ = 0.42       W ≈ 48, p ≈ 0.02*, δ = 0.12
fraccr(function, ⏺)  W ≈ 53, p < 0.01*, δ = 0.48       W ≈ 53, p < 0.01*, δ = 0.24
fraccr(feature, ⏺)   W ≈ 54, p < 0.01*, δ = 0.3        W ≈ 51, p < 0.01*, δ = 0.26
W = test value W, p = p-value, * for p < 0.05, δ = Cliff's δ effect size
Hypothesis RH1: Accepted. The comparison of the empirical data on the fulfillment of coordination requirements (for both types of motifs and across all artifacts) to the respective values of the null model shows that the identified coordination requirements are indeed not fulfilled by chance.
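
For readers unfamiliar with the effect size reported in Table 3, the following sketch computes Cliff's δ in its textbook form: the share of cross-sample pairs in which the first sample is larger, minus the share in which it is smaller. Whether the authors used exactly this variant is not stated here, and the sample values below are hypothetical, not taken from the study.

```python
def cliffs_delta(xs, ys):
    """Cliff's delta: (#pairs with x > y - #pairs with x < y) / (|xs| * |ys|)."""
    if not xs or not ys:
        raise ValueError("both samples must be non-empty")
    greater = sum(1 for x in xs for y in ys if x > y)
    less = sum(1 for x in xs for y in ys if x < y)
    return (greater - less) / (len(xs) * len(ys))


# Hypothetical fractions for illustration only.
empirical = [0.49, 0.45, 0.43]
null_model = [0.40, 0.44, 0.30]
delta = cliffs_delta(empirical, null_model)
print(delta)
```

The value ranges from -1 to 1. A positive δ means that the empirical fractions tend to exceed those of the null model.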

Sensitivity Analysis following Kossinets (2006)

We performed a sensitivity analysis following Kossinets (2006) to investigate the stability of our results. In detail, we used the simulation algorithm "BSPC" (boundary specification problem for contexts) to simulate the absence of coordination effort from the mailing list (coordination may instead take place on other channels, such as face-to-face meetings or chats) and, thus, incomplete information sources (i.e., mailing-list data), similar to the null models (see Section 3.2.4). The algorithm removes a defined number of random e-mail threads before constructing analyzable coordination-requirement networks and calculates the metrics as previously defined. To this end, for the projects BusyBox, Git, LLVM, and OpenSSL, we randomly removed 10, 20, …, 90 percent of the e-mail threads, performed 25 iterations for better randomization, calculated mean values, and analyzed the final results. In short, we found for the selected projects that the removal of 10% of all e-mail threads produces a relative error of about 15% for fraccr, across all revision ranges and for all motifs and source-code artifacts. With 20% of all e-mail threads being randomly removed, the metric exhibits a relative error of about 25%, on average. The results per project and motif are available in the table below. These results indicate that the absence of crucial developers may have an immediate and extensive effect on most projects and emphasize that any coordination effort is important to fulfill coordination requirements.
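
The removal procedure can be sketched as follows. This is a simplified illustration and not the BSPC implementation used in the study: threads are modeled as sets of participants, a requirement counts as fulfilled if both developers appear in one thread, and all names (`relative_error`, `threads`, `requirements`) as well as the fixed seed are our assumptions.

```python
import random
from itertools import combinations


def communicating_pairs(threads):
    """Developer pairs that share at least one e-mail thread."""
    pairs = set()
    for participants in threads:
        for d1, d2 in combinations(sorted(participants), 2):
            pairs.add(frozenset((d1, d2)))
    return pairs


def fraction_fulfilled(requirements, threads):
    """Share of requirements whose developer pair shares a thread."""
    pairs = communicating_pairs(threads)
    fulfilled = sum(1 for d1, d2 in requirements if frozenset((d1, d2)) in pairs)
    return fulfilled / len(requirements)


def relative_error(requirements, threads, removed_fraction, iterations=25, seed=0):
    """Mean relative error of the fraction after randomly removing threads."""
    rng = random.Random(seed)
    baseline = fraction_fulfilled(requirements, threads)
    if baseline == 0:
        raise ValueError("baseline fraction is zero; relative error is undefined")
    keep = len(threads) - round(removed_fraction * len(threads))
    values = [
        fraction_fulfilled(requirements, rng.sample(threads, keep))
        for _ in range(iterations)
    ]
    mean = sum(values) / iterations
    return abs(baseline - mean) / baseline


# Hypothetical toy data (not taken from the study).
threads = [{"A", "B"}, {"B", "C"}, {"A", "C"}, {"C", "D"}]
requirements = [("A", "B"), ("B", "C"), ("A", "D")]
for removed in (0.25, 0.5, 0.75):
    print(removed, relative_error(requirements, threads, removed))
```

Because removing threads can only remove communication edges, the mean fraction never exceeds the baseline, so the relative error in this sketch always lies between 0 and 1.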

Project Triangle motif m Square motif m
BusyBox # #
Git # #
LLVM # #
OpenSSL # #

Hypothesis RH2

After downloading and extracting the statistical results, the raw results for this hypothesis can be found in the folder stats/hypo2/empirical/. Both input and output data are available, along with all plots presented in this section.
...
Figure 7. Fraction of fulfilled coordination requirements fraccr(a, m) per motif m and artifact a as violin plot
...
Figure 8. Fraction of fulfilled coordination requirements fraccr(a, m) per motif m and artifact a as violin plot
Table 4. Results regarding Hypothesis RH2: Paired Wilcoxon signed-rank test for fraccr(a, m) of the triangle motif and fraccr(a, m) of the square motif, paired by subject project and artifact a.
Paired Wilcoxon signed-rank test
H0: fraccr(a, m) ≤ fraccrnull(a, m)
fraccr(a, m)
fraccr(a, m) N = 30, W = 325, p ≈ 0.9721167
N = number of pairs, W = test value W, p = p-value
Hypothesis RH2: Rejected. The comparison of empirical data on the fulfillment of coordination requirements for the triangle motif m and the corresponding data for the square motif m shows that the fulfillment of the identified coordination requirements is not higher for the triangle motif.

Hypothesis RH3

After downloading and extracting the statistical results, the raw results for this hypothesis can be found in the folder stats/hypo3/empirical/. Both input and output data are available, along with all plots presented in this section. In particular, all data regarding unique coordination requirements per artifact abstraction level are available for all subject systems in this folder as well (file hypo3-setdiffs.txt).
...
Figure 9. Fraction of fulfilled coordination requirements fraccr(a, m) per motif m and artifact a as violin plot
Table 5. Results regarding Hypothesis RH3: Paired Wilcoxon signed-rank test for the metrics fraccr(file, m), fraccr(function, m), and fraccr(feature, m), paired by subject project and motif m.
Paired Wilcoxon signed-rank test
H0:                  fraccr(file, m) ≥ fraccr(⏺, m)   fraccr(function, m) ≥ fraccr(⏺, m)
fraccr(function, ⏺)  N = 10, W = 152, p ≈ 0.12        -
fraccr(feature, ⏺)   N = 10, W = 64, p ≈ 0.98         N = 10, W = 52, p ≈ 0.98
N = number of pairs, W = test value W, p = p-value
Table 6. Number of unique coordination requirements per artifact abstraction level for Apache HTTP and the triangle motif m, identified only for the abstraction level of the column (subcolumns crfound), but not for the abstraction level of the row. The columns crfull indicate how many of these unique coordination requirements are fulfilled.
not identified by …   File                Function            Feature
                      crfound   crfull    crfound   crfull    crfound   crfull
File                        -        -          0        0         91       30
Function                 1074      438          -        -        183       72
Feature                  1718      753        736      357          -        -
Combined                  982      396          0        0         91       30
Hypothesis RH3: Rejected. The hypothesis that coordination requirements at the feature level are significantly more often fulfilled than for the other artifact abstractions is not supported by our data.

Hypothesis RH4

After downloading and extracting the statistical results, the data of the statistical tests and fractal dimension D can be found in the folder stats/hypo4/empirical/. The history plots for all subject systems and motifs are available in the download section separately. After extracting the downloaded file, see folder history-plots/.
Figure 10. Fraction of fulfilled coordination requirements fraccr(a, m) (triangle motif) for all subject projects (only revision ranges with sent e-mails shown). One panel per project: Apache HTTP, BusyBox, FFmpeg, Git, LLVM, OpenSSL, PostgreSQL, QEMU, U-Boot, Wine.
Figure 11. Fraction of fulfilled coordination requirements fraccr(a, m) (square motif) for all subject projects (only revision ranges with sent e-mails shown). One panel per project: Apache HTTP, BusyBox, FFmpeg, Git, LLVM, OpenSSL, PostgreSQL, QEMU, U-Boot, Wine.
Table 7. Fractal-dimension values D for all subject projects and motifs, sorted by Dm and grouped by similar values.
The groups of values are derived in combination with the plots in Figure 10 and Figure 11.
Project Dm Dm
QEMU 1.39 1.43
U-Boot 1.40 1.46
FFmpeg 1.49 1.55
LLVM 1.51 1.54
PostgreSQL 1.51 1.54
Wine 1.57 1.64
BusyBox 1.59 1.60
Git 1.59 1.58
Apache HTTP 1.65 1.71
OpenSSL 1.67 1.69
Hypothesis RH4: Rejected. The hypothesis that the fraction of fulfilled coordination requirements fraccr(a, m) improves over time cannot be shown across the complete set of subject projects. Instead, we found several different patterns in the organizational evolution indicating that there are very project-specific reasons leading to fulfilled and unfulfilled coordination requirements, such as a change of maintainers or even an attempted project-takeover.

Downloads

Downloadable assets:

Analysis Scripts

To reproduce the data for an individual subject project, the data needs to be processed by Codeface and codeface-extraction first – please use the configuration files provided above. Afterwards, the output data can be processed using our analysis scripts. The main script of our analysis is analysis.R, which needs to be used in every stage of the analysis.

usage: analysis.R [-h] [-s SELECTION_PROCESS] [-w SPLIT_WINDOW]
                  [--sliding-window] [-d CODEFACE_DATA]
                  [--null-model NULL_MODEL] [--no-remove-core]
                  [--complete-rerun] [--bootstrap-packrat]
                  [--loglevel LOGLEVEL]
                  script casestudy artifact artifact_relation

positional arguments:
  script                The subscript to execute
  casestudy             Casestudy name as in Codeface-data folder name
  artifact              Artifact type to use in the analysis
  artifact_relation     Artifact-relation type to use in the analysis

optional arguments:
  -h, --help            show this help message and exit
  -s SELECTION_PROCESS, --selection-process SELECTION_PROCESS
                        The selection process for revision windows [default
                        "releases"]
  -w SPLIT_WINDOW, --split-window SPLIT_WINDOW
                        The time-window length used for data splitting (e.g.,
                        "3 months" and "2 weeks") [default "3 months"]
  --sliding-window      Do you want overlapping time windows (i.e., sliding-
                        window approach)? [default "False"]
  -d CODEFACE_DATA, --codeface-data CODEFACE_DATA
                        Path to Codeface data [default "/local/hunsen/projects
                        /codeface-data/"]
  --null-model NULL_MODEL
                        The specific null model to use
                        ('rewire'|'random'|'errg') [default "rewire"]
  --no-remove-core      Remove core authors from null model? [default "False"]
  --complete-rerun      Do a complete re-run of the analysis? [default
                        "False"]
  --bootstrap-packrat   Bootstrap packrat library?
  --loglevel LOGLEVEL   Log level
Figure 12. Command-line interface of analysis.R

Overall, there are five different stages in our analysis, which need to be run consecutively and which (mostly) run for one subject system at a time (for each, data output is cached appropriately):

  1. data: run the empirical analysis by constructing coordination-requirement networks and searching for motifs
  2. null: construct null-model networks and search for motifs
  3. sensitivity: construct sensitivity model following Kossinets (2006) and search for motifs
  4. stats: run all statistical tests and construct corresponding plots (independent of configured subject system)
  5. history-plots: construct the history plots for Hypothesis RH4

Based on the configuration files provided by us, the following parameters should be given to each analysis stage (see above):

As a consequence, an exemplary call to gather feature-based empirical data for Apache HTTP looks like this:

Rscript analysis.R data apache-http feature cochange -d "/path/to/codeface-data/" -s "threemonth" -w "3 months" --loglevel "DEBUG"
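
When several artifact types or projects need to be processed, the call above can be assembled programmatically. The following helper is a hypothetical convenience wrapper, not part of the authors' scripts. It assumes that the stage names listed above are the values accepted for the positional `script` argument, and it simply mirrors the flags of the single example call, because the per-stage parameters are not reproduced here.

```python
import shlex

# Stage names as listed above; assumed to be valid values for `script`.
STAGES = ("data", "null", "sensitivity", "stats", "history-plots")


def build_command(stage, casestudy, artifact, relation, codeface_data,
                  selection="threemonth", window="3 months", loglevel=None):
    """Build the argument list for one analysis.R call (hypothetical helper)."""
    if stage not in STAGES:
        raise ValueError(f"unknown stage: {stage}")
    command = [
        "Rscript", "analysis.R", stage, casestudy, artifact, relation,
        "-d", codeface_data, "-s", selection, "-w", window,
    ]
    if loglevel is not None:
        command += ["--loglevel", loglevel]
    return command


example = build_command("data", "apache-http", "feature", "cochange",
                        "/path/to/codeface-data/", loglevel="DEBUG")
print(shlex.join(example))  # shell-quoted form of the example call
```

The resulting list can be passed to `subprocess.run` without going through a shell, which avoids quoting problems with values such as "3 months".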

Contact

If you have any questions regarding this paper or any other related project, please do not hesitate to contact us: