Welcome! Please see the About page for a little more info on how this works.

0 votes
in tools.namespace by

Problem

I noticed that when a dependency, example.one, was removed from another namespace, example.two, example.one still showed up under [::track/deps :dependents 'example.two] in the tracker after an invocation of repl/scan.

After digging into this for some time, I saw that the implementation of remove-node in MapDependencyGraph didn't make any changes to the :dependents attribute.

This seems like a mistake to me, as the remove-node docstring reads:

Removes the node from the dependency graph without removing it as a
dependency of other nodes. That is, removes all outgoing edges from
node.

The implementation of remove-node follows:

(remove-node [graph node]
  (MapDependencyGraph.
    (dissoc dependencies node)
    dependents))

This removes outgoing edges from dependencies, but if dependents is effectively an inverse of dependencies, then I'd expect to see some sort of removal from dependents as well. Instead, dependents remains unmodified.

Proposed Fix

If this is indeed a bug, and not a misunderstanding on my part, I have a fix here along with a couple tests on my behavioral-tests branch that removes node from values in the dependents map.

Does this align with the expected behavior of remove-node, or am I misunderstanding the intended behavior of this function?

1 Answer

0 votes
by

I don't actually know the original intent but the docstring you quote says "without removing it as a dependency of other nodes" which sounds like what it does?

by
Yes and no. I'll add some more detail here.

Suppose we have a project with dependencies where `example.one` depends on `example.two` and `example.two` depends on `example.three`.

In this case, tracker would look something like this:
```
{
  :dependencies {'example.one #{'example.two}
                 'example.two #{'example.three}
                 }
  :dependents   {'example.two #{'example.one}
                 'example.three #{'example.two}
                 }
}
```

`:dependencies` represents the set of namespaces a given key depends on:
- `example.one` depends on `example.two`
- `example.two` depends on `example.three`

`:dependents` represents the set of namespaces dependent on a given key:
- `example.two` has one dependent: `example.one`
- `example.three` has one dependent: `example.two`

Both maps contain the same information, just structured as an inverse of one another. All four facts noted above can be derived from both maps.

If we were to apply the operation described by the docstring to the `example.two` namespace, we would be removing two of these facts (each an inverse of the other):
- `example.two` depends on `example.three`
- `example.three` has one dependent: `example.two`

Producing the updated structure:
```
{
  :dependencies {'example.one #{'example.two}}
  :dependents   {'example.two #{'example.one}
                 'example.three #{}
                 }
}
```

Now we only have the two remaining facts:
- `example.one` depends on `example.two`
- `example.two` has one dependent: `example.one`

The dependencies of `example.three` haven't changed, only those that are dependent upon it: `example.two`.

In the current implementation, `example.two` is being removed from `:dependencies`, which only removes its own dependencies. However, it's not being removed from those it depends on under `:dependents`, giving us this structure instead:
```
{
  :dependencies {'example.one #{'example.two}}
  :dependents   {'example.two #{'example.one}
                 'example.three #{'example.two}
                 }
}
```

Here, `example.three` has the dependent of `example.two`, implying that `example.two` depends on `example.three` (the fact that has already been removed from `:dependencies`).

Maybe this was a too long-winded explanation, so I hope that all makes sense.
by
When I read the docstring "Removes the node from the dependency graph without removing it as a dependency of other nodes. That is, removes all outgoing edges from node." that seems to describe exactly what the function is currently doing (removes the node, removes outgoing edges, DOES NOT remove incoming edges).

I agree that that behavior seems confusing in its utility, but presumably existing callers are expecting that. It would be helpful to double-check both internal and external (if findable via https://grep.app etc) uses to see if that seems true.

If so, then I would instead consider this a request for a new operation that does what you're after. If not, then still may not want to touch it for fear of breaking unknown callers, would maybe try to look further into the git history.
by
Sorry for the late reply here, things got busy the past couple weeks.

These are great ideas – I hadn't considered using grep on the git history.


Looks like there are only two usages of remove-node in tools.namespace (I still need to check on other clojure projects):
1. dependency/topo-sort
  - The change I made doesn't have any effect here, since the function already calls remove-edge elsewhere with each node and its own dependencies.
2. track/remove-deps > track/add
  - There are other functions much higher up the chain (dir/scan-files, dir/scan-dirs, repl/scan) where I noticed something off with the behavior. These are what I created the "proposed fix" tests against.


A search on "remove-node" in git history brought back a commit from 2021. Looks more related to some upstream functionality than to remove-node itself, but the author did provide a good bit of context:

commit 5aaa6a328c8aaf0ae8ea7e5c48736eb2f1825392
Author: zalky <zalan.k@gmail.com>
Date:   Mon May 24 13:08:30 2021 -0400

    Fix TNS-6: fully remove ns from deps graph

    The clojure.tools.namespace.dependency/remove-node fn removes a
    namespace's :dependencies set from the graph (outgoing edges), but
    does not remove that namespace from the :dependents set of other
    namespaces (ingoing edges). If a namespace is removed and one of its
    dependencies is changed at the same time, as commonly occurs when
    switching branches, then the removed namespace is still in the
    :dependents of the changed namespace when it is reloaded. As all the
    of the dependents of a changed namespace are also reloaded, this
    throws a missing namespace error.

    The resolution is to use clojure.tools.namespace.depedency/remove-all
    to remove both outgoing (:dependencies) and ingoing (:dependents)
    edges in the dependency graph.

    However, care must be taken when choosing the point of change, because
    clojure.tools.namespace.track/add relies on the partial removal
    behaviour of clojure.tools.namespace.track/remove-deps. It uses
    remove-deps to wipe the dependencies of a node, and cannot know how to
    re-insert dependents if lost. Therefore the point of change should be
    in clojure.tools.namespace.track/remove.

    Signed-off-by: Fogus <mefogus@gmail.com>


I am now leaning toward your suggestion of a new operation since there is this precedence of not changing the remove-node function.

Another possible solution that would solve the heart of the problem (repl/scan) would be to update the remove-deps function in clojure.tools.namespace.track with something like this:

(defn- remove-edges [deps node]
  (->> (dep/immediate-dependencies deps node)
       (reduce #(dep/remove-edge %1 node %2) deps)))

(defn- remove-deps [deps names]
  (reduce remove-edges deps names))
...