Package Manager Flow
Posted on 12/10/2022 | đïž Edit on GitHubOverview
We should eventually be able to describe the makeup of a package manager - akin to how the Open Containers Initiative has an image-spec, distribution-spec, and runtime-spec for containers, we want similar specs to describe actions and states for package management. I donât have a holistic view yet of what this should look like, but Iâll include interesting notes that are relevant. Here is an early understanding of a general flow:
user preferences --> package manager --> translate to upgrade problem (CUDF) --> solver --> --> CUDF-encoded solution --> scenario --> [if solution] --> action [else] rollback
The above says that we start with user preferences that are assembled by the package manager, hand them off to a solver (package manager agnostic, used like plugins) to come up with a solution. There are several ways to describe a solution (scenario) that determines if we move forward or not. The input and output to the solver is a CUDF document that describes the needed changes. There are also terms that can describe the scenarios terms
Software Maintenance Technologies
The âingredientsâ of âsoftware maintenance technologiesâ according to [1] are:
- dependencies
- conflicts
- package managers with dependency solving capabilities
Package Manager
The package manager should generally have the following inputs and outputs [2]
- Take as input:
- the current status of packages on the system
- a universe of all available packages (in other papers called
U
) - a user request (e.g., âinstall xyz, remove xyzâ)
- user preferences (e.g., âminimize updated softwareâ)
- Return as output:
- an upgrade plan: a list of actions to take on the system to reach the status that the user wants (e.g., installing thething)
And we can evaluate these steps based on expressivity - how empowered the user is to express preferences, and completeness - being able to propose a valid update plan (the output) when one exists.
In [2] they (mostly same) authors make a similar statement, saying that package managers should (this is verbatim):
- devise upgrade plans that are correct (i.e. no plan that violates component expectations is proposed) and complete (i.e. every time asuitable plan exists, it can be found)
- have performances that scale up gracefully at component repositories growth
- empower users to express preferences on the desired component configuration when several options exist, which is often the case
User preferences
The user should be able to specify high level criteria to describe what they want (e.g., minimize changed packages). Of course the more criteria we define, the more likely we are to not be able to find a solution or have something pareto-optimal. Approaches to deal with this include:
- Lexicographic: order criteria by importance (e.g., a security update gets ranked higher)
- Weighted sum: aggregate criteria into single measure using some set of weights
- lexmin and lexmax: (havenât read about yet)
The authors in [3, P. 14] propose to do the following:
- define a dictionary of criteria
- define a dictionary of aggregation functions (e.g., the above)
- write the user preference as an expression of the aggregation function (op) and directionality (e.g., +/-) for each criteria. E.g.,
lex(-removed, -changed)
Says use lexmin to aggregate, and minimize changed and removed packages. They present a nice table of optimization criteria:
And this is a neat idea because you can then describe specific scenarios (and name them)
paranoid=lex(âremoved,âchanged)
trendy=lex(âremoved,ânotuptodate,âunsatrec,ânew)
Scenarios
The paper [3, P. 12] describes different solution scenarios, rooted in CUDF, and this would be a good terminology to harden to better describe our work.
We start with a CUDF document
and define a solution to it
as a subset (S
) of the entire package universe (U
). We then say that dom(S)
(domain) are the pairs
of packages (e.g., (name, version)
in our subset (e.g., what CUDF is asking for) in S
, and pro(S)
(provided)
the same (name, version)
pairs of all that are available, which can be infinite (gulp!).
This defines an atomic package constraint. A subset S
in U
is (these are written verbatim from the paper):
- abundant if every disjunction in the dependency of every package in
S
contains a package constraint that is satisfied bydom(S) âȘ pro(S)
. I think this is saying that a solution is abundant if every nested dependency has itâs own dependency needs met by either the set of pairs that we need, or the set of pairs available to us. Itâs abundant possibly because it says that the universe is rich with everything that we need. - peaceful if no atomic package constraint in the conflicts of any package
pâS
is satisfied bydom(Sâ{p}) âȘ pro(Sâ{p}
). I think this is saying that if we removed p from both sets, there would be no conflict.
Requests
A subest S
of the universe U
satisfies a request:
- install Ï if every atomic constraint in Ï is satisfied by
dom(S) âȘ pro(S)
. This is saying we move forward to install if the universe is abundant?) - remove Ï if no element of Ï is satisfied by
dom(S) âȘ pro(S)
.
A set S
is only a solution if it is abundant, healthy, and satisfies the request.
Solutions are allowed to have several packages with the same name and different versions.
Terms
We can further classify solutions based on:
- correctness: have we satisfied the user request?
- quality: do I like the solution?
Solvers
It looks like there are many solver technologies, as references by [3] in the MISC competition (Mancoosi International Solver Competition ref) including but not limited to:
- boolean satisfiability (SAT)
- Mixed Integer Linear Programming (MILP)
- Answer Set Programming (ASP)
- graph constraints
And I think we intend to work on one based on ABI compatability. Many of these can handle upgrade problems encoded as CUDF documents.
- P. Abate, R. Di Cosmo, R. Treinen, and S. Zacchiroli, âDependency solving: a separate concern in component evolution management,â Journal of Systems and Software, vol. 85, no. 10, pp. 2228â2240, 2012. Full details | PDF
- P. Abate, R. Di Cosmo, G. Gousios, and S. Zacchiroli, âDependency Solving Is Still Hard, but We Are Getting Better at It,â in 2020 IEEE 27th International Conference on Software Analysis, Evolution and Reengineering (SANER), 2020, pp. 547â551. Full details | PDF
- P. Abate, R. Di Cosmo, R. Treinen, and S. Zacchiroli, http://www.sciencedirect.com/science/article/pii/S0950584912001851âA modular package manager architectureâ</a> Information and Software Technology, vol. 55, no. 2, pp. 459â474, 2013. Full details | PDF