Share your thoughts in the 2021 Clojure Community Survey!

Welcome! Please see the About page for a little more info on how this works.

+1 vote
in Libs by
edited by

While attempting to retrieve files from S3 within a Lambda function using Cognitect's aws-api, I get unexpected slow behavior. Sample code below was used to get a general sense of the time each relevant operation takes. Getting the client (aws/client {:api :s3}) consistently takes over 4 seconds. As a point of comparison, the same code executing from my local development environment accessing the same S3 bucket/key, executes without the same noticeable delay as when it executes within Lambda.

I should note that the namespace in which this function resides leverages the gen-class facility.

(defn read-s3-file
  [bucket obj-key]
  (let [_ (println (str "-----> Before declaring s3 client: " (java.time.LocalDateTime/now)))
        s3 (aws/client {:api :s3})
        _ (println (str "-----> After declaring s3 client: " (java.time.LocalDateTime/now)))
        _ (println (str "---> bucket: " bucket ", key " obj-key))
        body  (:Body (aws/invoke s3 {:op :GetObject :request {:Bucket bucket :Key obj-key}}))
        _ (println (str "-----> After GetObject: " (java.time.LocalDateTime/now)))
        ]
    (println (str "-----> Before calling slurp: " (java.time.LocalDateTime/now)))
    (slurp body)
    )
  )

Would this be caused by the java8 runtime required by Lambda?

Local output:

clj-lead-updater.core=> (clj-lead-updater.s3/read-s3-file "automattix-incoming-emails" "tdf10001AnsSvc/0dqsmtvh5mquq3tdhpq550lnqcp6162f0k29da01")
-----> Before declaring s3 client: 2019-11-18T18:52:27.611925
-----> After declaring s3 client: 2019-11-18T18:52:27.638097
---> bucket: automattix-incoming-emails, key tdf10001AnsSvc/0dqsmtvh5mquq3tdhpq550lnqcp6162f0k29da01
-----> After GetObject: 2019-11-18T18:52:28.096044
-----> Before calling slurp: 2019-11-18T18:52:28.096431

The two relevant measures are:

-----> Before declaring s3 client: 2019-11-18T18:52:27.611925
-----> After declaring s3 client: 2019-11-18T18:52:27.638097

Lambda output:

-----> Before turning input into a map: 2019-11-17T02:27:08.705
-----> Before declaring s3 client: 2019-11-17T02:27:08.781
2019-11-17 02:27:10.641:INFO::main: Logging initialized @8431ms to org.eclipse.jetty.util.log.StdErrLog
-----> After declaring s3 client: 2019-11-17T02:27:12.923
-----> After GetObject: 2019-11-17T02:27:14.446
-----> Before calling slurp: 2019-11-17T02:27:14.446

And, again, the two relevant measures are:

-----> Before declaring s3 client: 2019-11-17T02:27:08.781
-----> After declaring s3 client: 2019-11-17T02:27:12.923

Relevant dependencies in the project:

  [org.clojure/clojure "1.10.1"]
  [com.cognitect.aws/api "0.8.391"]
  [com.cognitect.aws/endpoints "1.1.11.670"]
  [com.cognitect.aws/s3 "770.2.568.0"]

2 Answers

+1 vote
by

When an aws api client fetches credentials locally, it typically pulls them right from your environment, a credentials file, or system properties, all of which are very fast operations. Running in AWS, it's going to fetch credentials over http, which will naturally be slower than file system operations. 4 seconds faster? Probably not, so I'd guess there's something else at play as well, but certainly this would be a factor.

I won't have time to research this for at least a couple of days. If you have the time, please post the output you're seeing locally and in the lambda.

Thanks!

by
Thank you for replying so quickly. I updated the original question with the output from both environments and with the (relevant) dependencies referenced in the project.

Thanks again!
by
Follow up: It is very likely the performance differences are entirely due to cold-starts. After several invocations of the Lambda function, the performance of the function in question (creating the aws client, which fetches the credentials), approaches the timing measured when the same code is exercised locally via the repl.
+1 vote
by

I have been seeing the same thing. There seems to be a significant overhead in creating a the client for the first time in AWS Lambda environment and using it to execute a request. And this overhead doesn't seem to be related to Lambda cold start time.

For example using 512M memory for lambda creating S3 client and executing 1 metadata fetch takes about 13 seconds. For comparison doing the same thing with aws-java-sdk (v1) in clojure takes about 11-12 second and same thing with nodejs lambda takes less than a second. Increasing Lambda memory size will make things work much faster.

I have a test setup where the third invocation of the lambda creates the client and executes S3 request. Invocations before that do nothing.
1. invocation of the lambda is the cold start and it does nothing, and it is slow (e.g 4 seconds)
2. invocation of the lambda is a warm start and it does nothing => fast (e.g 0.002 seconds)
3. invocation is a warm start and creates the client for the first time => slow (e.g 13 seconds)
4. invocation and following invocations => fast (e.g 0.8 seconds)

When running as a AWS Lambda the credentials are available as environment variables so accessing them should be fast.

AWS published java11 runtime for Lambdas a few days ago, but using it doesn't seem to affect the execution time that much.

All my code is AOT compiled.

...