

Our CI runs failed because of a temporary Clojars blip:

Error building classpath. Could not transfer artifact
rewrite-clj:rewrite-clj:jar:1.1.45 from/to clojars
(https://repo.clojars.org/): Connect to repo.clojars.org:443
[repo.clojars.org/146.75.29.128] failed: Connect timed out 401 Error:
Process completed with exit code 1.

Would it make sense to add retries to tools.deps when it downloads dependencies, so that temporary failures like this one don't fail the build?
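To make the request concrete, here is a minimal sketch of retrying with exponential backoff. It is not tools.deps or Maven code: `RetrySketch`, `withRetries`, and the `retryable` predicate are hypothetical names, and the "download" in `main` is simulated.

```java
import java.util.function.Predicate;
import java.util.function.Supplier;

/** Hypothetical helper for illustration; not part of tools.deps or Maven. */
final class RetrySketch {

    /**
     * Runs the action up to maxAttempts times, doubling the delay between
     * attempts. Failures rejected by the predicate are rethrown immediately.
     */
    static <T> T withRetries(int maxAttempts,
                             long initialDelayMillis,
                             Predicate<RuntimeException> retryable,
                             Supplier<T> action) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delay = initialDelayMillis;
        for (int attempt = 1; ; attempt++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                // Give up on the last attempt or on a non-transient failure.
                if (attempt >= maxAttempts || !retryable.test(e)) {
                    throw e;
                }
            }
            sleep(delay);
            delay *= 2;
        }
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting to retry", e);
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated download that fails twice before succeeding.
        String result = withRetries(3, 10, e -> true, () -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("simulated connect timeout");
            }
            return "classpath resolved";
        });
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

The predicate is the hard part in practice: it has to decide which failures are worth a second attempt.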

1 Answer

0 votes
by

401 means unauthorized. That doesn't seem like an error we would want to retry?

I'm not sure we can differentiate the cases where a retry would be worthwhile (and some of this may be obscured by the Maven libs too).

by
The other aspect is that Maven may try multiple repos while looking for the one that holds the artifact. Adding retries would presumably make that search take longer, add a lot of traffic to all repos, or both.
by
We often get similar errors in CI, although not 401s. Rerunning the build always seems to fix it.

Downloading: metosin/muuntaja/0.6.8/muuntaja-0.6.8.pom from clojars
...
Error building classpath. Failed to read artifact descriptor for metosin:muuntaja:jar:0.6.8
org.eclipse.aether.resolution.ArtifactDescriptorException: Failed to read artifact descriptor for metosin:muuntaja:jar:0.6.8
    at org.apache.maven.repository.internal.DefaultArtifactDescriptorReader.loadPom(DefaultArtifactDescriptorReader.java:259)
    at org.apache.maven.repository.internal.DefaultArtifactDescriptorReader.readArtifactDescriptor(DefaultArtifactDescriptorReader.java:175)
    at org.eclipse.aether.internal.impl.DefaultRepositorySystem.readArtifactDescriptor(DefaultRepositorySystem.java:255)
    at clojure.tools.deps.alpha.extensions.maven$fn__1061.invokeStatic(maven.clj:132)
    at clojure.tools.deps.alpha.extensions.maven$fn__1061.invoke(maven.clj:122)
    at clojure.lang.MultiFn.invoke(MultiFn.java:244)
    at clojure.tools.deps.alpha$expand_deps$children_task__754$fn__756$fn__757.invoke(alpha.clj:406)
    at clojure.tools.deps.alpha.util.concurrent$submit_task$task__479.invoke(concurrent.clj:35)
    at clojure.lang.AFn.call(AFn.java:18)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: org.eclipse.aether.resolution.ArtifactResolutionException: Could not transfer artifact metosin:muuntaja:pom:0.6.8 from/to clojars (https://repo.clojars.org/): /ci/.m2/repository/metosin/muuntaja/0.6.8/muuntaja-0.6.8.pom.part (No such file or directory)
    at org.eclipse.aether.internal.impl.DefaultArtifactResolver.resolve(DefaultArtifactResolver.java:425)
    at org.eclipse.aether.internal.impl.DefaultArtifactResolver.resolveArtifacts(DefaultArtifactResolver.java:229)
    at org.eclipse.aether.internal.impl.DefaultArtifactResolver.resolveArtifact(DefaultArtifactResolver.java:207)
    at org.apache.maven.repository.internal.DefaultArtifactDescriptorReader.loadPom(DefaultArtifactDescriptorReader.java:244)
    ... 12 more
Caused by: org.eclipse.aether.transfer.ArtifactTransferException: Could not transfer artifact metosin:muuntaja:pom:0.6.8 from/to clojars (https://repo.clojars.org/): /ci/.m2/repository/metosin/muuntaja/0.6.8/muuntaja-0.6.8.pom.part (No such file or directory)
    at org.eclipse.aether.connector.basic.ArtifactTransportListener.transferFailed(ArtifactTransportListener.java:52)
    at org.eclipse.aether.connector.basic.BasicRepositoryConnector$TaskRunner.run(BasicRepositoryConnector.java:369)
    at org.eclipse.aether.util.concurrency.RunnableErrorForwarder$1.run(RunnableErrorForwarder.java:75)
    at org.eclipse.aether.connector.basic.BasicRepositoryConnector$DirectExecutor.execute(BasicRepositoryConnector.java:628)
    at org.eclipse.aether.connector.basic.BasicRepositoryConnector.get(BasicRepositoryConnector.java:262)
    at org.eclipse.aether.internal.impl.DefaultArtifactResolver.performDownloads(DefaultArtifactResolver.java:514)
    at org.eclipse.aether.internal.impl.DefaultArtifactResolver.resolve(DefaultArtifactResolver.java:402)
    ... 15 more
Caused by: java.io.FileNotFoundException: /ci/.m2/repository/metosin/muuntaja/0.6.8/muuntaja-0.6.8.pom.part (No such file or directory)
    at java.base/java.io.FileInputStream.open0(Native Method)
    at java.base/java.io.FileInputStream.open(FileInputStream.java:219)
    at java.base/java.io.FileInputStream.<init>(FileInputStream.java:157)
    at org.eclipse.aether.internal.impl.DefaultFileProcessor.copy(DefaultFileProcessor.java:163)
    at org.eclipse.aether.internal.impl.DefaultFileProcessor.copy(DefaultFileProcessor.java:151)
    at org.eclipse.aether.internal.impl.DefaultFileProcessor.move(DefaultFileProcessor.java:252)
    at org.eclipse.aether.connector.basic.BasicRepositoryConnector$GetTaskRunner.runTask(BasicRepositoryConnector.java:482)
    at org.eclipse.aether.connector.basic.BasicRepositoryConnector$TaskRunner.run(BasicRepositoryConnector.java:364)
    ... 20 more
by
The big stack trace above is caused by a missing local file, not by a problem finding a network resource:

   Caused by: java.io.FileNotFoundException: /ci/.m2/repository/metosin/muuntaja/0.6.8/muuntaja-0.6.8.pom.part (No such file or directory)

Hard to say without knowing more about the CI setup, but it may be some kind of problem with a shared /ci filesystem.
by
I just went back and looked at the logs: the 401 was the log's line number, which I had accidentally included when copying.

Here are the logs with timestamps:

00:21:14 GMT Downloading: org/clojure/data.codec/0.1.0/data.codec-0.1.0.jar from central
00:21:14 GMT Downloading: com/amazonaws/jmespath-java/1.11.713/jmespath-java-1.11.713.jar from central
00:21:14 GMT Downloading: clj-commons/pomegranate/1.2.1/pomegranate-1.2.1.jar from clojars
00:21:24 GMT Error building classpath. Could not transfer artifact rewrite-clj:rewrite-clj:jar:1.1.45 from/to clojars (https://repo.clojars.org/): Connect to repo.clojars.org:443 [repo.clojars.org/146.75.29.128] failed: Connect timed out
00:21:24 GMT Error: Process completed with exit code 1.

If you were going to retry, I think ConnectTimeoutException is the one you would want to catch: https://www.javadoc.io/doc/org.apache.httpcomponents/httpclient/4.3.4/org/apache/http/conn/ConnectTimeoutException.html
by
lol on the 401.

I'm open to this idea if we can distinguish, in the right place, between cases where the artifact doesn't exist and cases where the service is unavailable. It's not clear to me that we have a point where we can tell the two apart, so more investigation is needed.
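As a starting point for that investigation, one option is to walk the exception's cause chain and look for connection-level failures. The sketch below is only an assumption about how that could work: `TransientFailureSketch` and `looksTransient` are hypothetical names, the choice of exception types is a guess, and whether the Maven libs leave these causes visible at the point where tools.deps sees the error is exactly the open question. The Apache class from the link above is matched by name so the example compiles with the JDK alone.

```java
import java.io.FileNotFoundException;
import java.net.ConnectException;
import java.net.SocketTimeoutException;

/** Hypothetical classifier for illustration; not part of tools.deps. */
final class TransientFailureSketch {

    // Matched by name so this sketch needs no Apache HttpClient dependency.
    private static final String APACHE_CONNECT_TIMEOUT =
            "org.apache.http.conn.ConnectTimeoutException";

    /** Returns true if any cause in the chain looks like a connection problem. */
    static boolean looksTransient(Throwable error) {
        int depth = 0;
        // The depth limit guards against a cyclic cause chain.
        for (Throwable t = error; t != null && depth < 32; t = t.getCause(), depth++) {
            if (t instanceof SocketTimeoutException
                    || t instanceof ConnectException
                    || t.getClass().getName().equals(APACHE_CONNECT_TIMEOUT)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Throwable timedOut = new RuntimeException(
                "Could not transfer artifact",
                new SocketTimeoutException("Connect timed out"));
        Throwable missingFile = new RuntimeException(
                "Could not transfer artifact",
                new FileNotFoundException("muuntaja-0.6.8.pom.part"));

        System.out.println(looksTransient(timedOut));    // true
        System.out.println(looksTransient(missingFile)); // false
    }
}
```

Anything not recognized is treated as non-transient, so a missing artifact or a local file problem like the `.pom.part` case above would not be retried.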
by
Here is another fairly common error we see in CI:

+ clojure -P
Cloning: https://github.com/seancorfield/build-clj.git
Error building classpath. Unable to fetch /root/.gitlibs/_repos/https/github.com/seancorfield/build-clj
fatal: unable to access 'https://github.com/seancorfield/build-clj.git/': Failed to connect to github.com port 443 after 134828 ms: Connection timed out
error: Could not fetch origin

Adding a 30-second timeout here might be good too.
by
You know I'm not going to be able to resist:

build-clj has been archived for over a year and a half and the readme suggests you don't use it.

Back to the point: Doesn't tools.deps shell out to git? That connection timeout is happening at the command-line level, isn't it?
by
Yes, tools.deps.git just shells out to git so if this is a credentials or timeout issue, that's what you need to fix.
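For reference, bounding a shelled-out command from the JVM side could look roughly like the sketch below. This is not how tools.deps.git is implemented: `CommandTimeoutSketch` and `run` are hypothetical names, and the 30-second figure is just the value floated above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.time.Duration;
import java.util.List;
import java.util.OptionalInt;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper for illustration; not part of tools.deps.git. */
final class CommandTimeoutSketch {

    /**
     * Runs the command and returns its exit code, or an empty result if it
     * was still running at the deadline and had to be killed.
     */
    static OptionalInt run(List<String> command, Duration timeout) {
        Process process;
        try {
            process = new ProcessBuilder(command)
                    .redirectErrorStream(true)
                    // Discard output so a chatty child cannot block on a full pipe (Java 9+).
                    .redirectOutput(ProcessBuilder.Redirect.DISCARD)
                    .start();
        } catch (IOException e) {
            throw new UncheckedIOException("could not start " + command, e);
        }
        try {
            if (process.waitFor(timeout.toMillis(), TimeUnit.MILLISECONDS)) {
                return OptionalInt.of(process.exitValue());
            }
            // Deadline passed: kill the child instead of waiting on the network.
            process.destroyForcibly();
            return OptionalInt.empty();
        } catch (InterruptedException e) {
            process.destroyForcibly();
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for " + command, e);
        }
    }

    public static void main(String[] args) {
        // Assumes git is on the PATH; any command would do.
        OptionalInt exit = run(List.of("git", "--version"), Duration.ofSeconds(30));
        System.out.println(exit.isPresent()
                ? "exit code " + exit.getAsInt()
                : "timed out");
    }
}
```

A deadline like this only shortens the wait; the underlying connection or credentials problem still has to be fixed on the git side.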
...