Who cares whether some computer program is a whole, how, and why? Turns out, more people than you may think—and so should you, since it can be costly depending on the answer. Consider the following two scenarios: 1) you download a ‘pirated’ version of MS Office or Adobe Photoshop (the most popular ones still) and 2) you take the source code of a popular open source program, such as Notepad++, add a little code for some additional function, and put it up for sale only as an executable app called ‘Notepad++ extreme (NEXT)’ so as to try to earn money quickly. Are these actions legal?
In both cases, you’d break the law, but how many infringements took place, of the one that you potentially could be fined for or face jail time? For the piracy case, is that once for the MS Office suite, or for each progam in the suite, or for each file created upon installing MS office, or for each source code file that went into making the suite during software development? For the open source case, was that violating its GNU GLP open source licence once for the zipped&downloaded or cloned source code or for each file in the source code, of which there are hundreds? It is possible to construct similar questions for trade secret violations and patent infringements for programs, as well as other software artefacts, like illegal downloads of TV series episodes (going strong during COVID-19 lockdowns indeed). Just in case you think this sort of issue is merely hypothetical: recently, Arista paid Cisco $400 million for copyright damages and just before that, Zenimax got $500 million from Oculus (yes, the VR software) for trade secret violations, and Google vs Oracle is ongoing with “billions of dollars at stake”.
Let’s consider some principles first. To be able to answer the number of infringements, we first need to know whether a computer program is a whole or not and why, and if so, what’s ‘in’ (i.e., a part of it) and what’s ‘out’ (i.e., definitely not part of it). Spoiler alert: a computer program is a functional whole.
To get to that conclusion, I had to combine insights from theories of parthood (mereology), granularity, modularity, unity, and function and add a little more into the mix. To provide less and more condensed versions of the argumentation, there is a longer technical report [1], of which I hope it is readable by a wider audience, and a condensed version for a specialist audience [2] that was published in the Proceedings of the 11th Conference on Formal Ontologies in Information Systems (FOIS’20) two weeks ago. Very briefly and informally, the state of affairs can be illustrated with the following picture:

This schematic representation shows, first, two levels of granularity: level 1 and level 2. At level 1, there’s some whole, like the a1 and a2 in the figure that could be referring to, say, a computer program, a module repository, an electorate, or a human body. At a more fine-grained level 2, there are different entities, which are in some way linked to the respective whole. This ‘link’ to the whole is indicated with the vertical dashed lines, and one can say that they are part of the whole. For the blue dots on the right residing at level 2, i.e., the parts of a1, there’s also a unifying relation among the parts, indicated with the solid lines with arrows, which makes a1 an integral whole. Moreover, for that sort of whole, it holds that if some object x (residing at level 2) is part of a1 then if there’s a y that is also part of a1, it participates in that unifying relation with x and vice versa (i.e., if y is in that unifying relation with x, then it must also be part of a1). For the computer program’s source code, that unifying relation can be the source tree graph.
There is some nitty gritty detail also involving the notion of function—a source code file contributes to doing something—and optional vs mandatory vs essential part that you can read about in the report or in the paper [1,2], covering the formalisation, more argumentation, and examples.
How would it pan out for the infringements? The Notepad++ exploitation scenario would simply be a case of one infringement in total for all the files needed to create the executable, not one for each source code file. This conclusion from the theory turns out remarkably in line with the GNU GPL’s explanation of their licence, albeit then providing a theoretical foundation for their intuition that there’s a difference between a mere aggregate where different things are bundled, loose coupling (e.g., sockets and pipes) and a single program (e.g., using function calls, being included in the same executable). The order of things perhaps should have been from there into the theory, but practically, I did the analysis and stumbled into a situation where I had to look up the GPL and its explanatory FAQ. On the bright side, in the other direction now then: just in case someone wants to take on copyleft principles of open source software, here are some theoretical foundations to support that there’s probably much less money to be gained than you might think.
For the MS Office suite case mentioned at the start, I’d need a look under the hood to determine how it ties together and one may have to argue about the sameness of, or difference between, a suite and a program. The easier case for a self-standing app, like the 3rd-place most pirated Windows app Internet Download Manager, is that it is one whole and so one infringement then.
It’s a pity that FOIS 2020 has been postponed to 2021, but at least I got to talk about some of this as expert witness for a litigation case and I managed to weave an exercise about the source tree with open source licences into the social issues and professional practice module I thought to some 750 students this past winter.
References
[1] Keet, C.M. Why a computer program is a functional whole. Technical report 2008.07273, arXiv. 21 July 2020. 25 pages.
[2] Keet, C.M. The computer program as a functional whole. Proc. of FOIS 2020. Brodaric, B. and Neuhaus, F. (Eds.). IOS Press. FAIA vol. 330, 216-230.