
0 votes
in Compiler by

We found that when a Java agent (one that modifies bytecode via ClassFileTransformer) tries to access the "this" reference on Clojure function exit, "this" is no longer available.

This is likely caused by the code change at https://github.com/clojure/clojure/blame/b19b781b1f0f3f46aee5e951f415e0456a39cbcb/src/jvm/clojure/lang/Compiler.java#L5946, which clears the "this" reference in the generated bytecode before the method exits.

This is problematic for Java agent instrumentation, as we could technically inject code after the return statement (usually as a finally block), and it's unexpected that the "this" reference is missing since we are still within the scope of the method invocation.
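For context, here is a minimal sketch of the kind of agent involved, using only `java.lang.instrument`. The class names `ExitHookAgent` and `ExitHookTransformer` and the `clout/core` prefix are hypothetical, and the actual rewriting step (done with a bytecode library in our case) is left out, so the sketch hands the class bytes back unchanged.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;

// Hypothetical agent entry point. In a real agent jar this class would be
// public and named in the Premain-Class manifest attribute.
final class ExitHookAgent {
    // Called by the JVM before main() when started with -javaagent:<jar>.
    public static void premain(String agentArgs, Instrumentation inst) {
        // "clout/core" is an assumed prefix, used here only for illustration.
        inst.addTransformer(new ExitHookTransformer("clout/core"));
    }
}

// Hypothetical transformer that only looks at classes under one prefix.
final class ExitHookTransformer implements ClassFileTransformer {
    private final String internalNamePrefix;

    ExitHookTransformer(String internalNamePrefix) {
        this.internalNamePrefix = internalNamePrefix;
    }

    @Override
    public byte[] transform(ClassLoader loader,
                            String className,
                            Class<?> classBeingRedefined,
                            ProtectionDomain protectionDomain,
                            byte[] classfileBuffer) {
        // className uses the internal form, e.g. "java/lang/String".
        if (className == null || !className.startsWith(internalNamePrefix)) {
            return null; // null tells the JVM to leave the class untouched
        }
        // A bytecode library would rewrite classfileBuffer here to add the
        // exit hook. This sketch returns a copy of the bytes unchanged.
        return classfileBuffer.clone();
    }
}
```

Any code that such a transformer adds at method exit runs in the same frame as the original body, so it sees whatever local variable 0 holds at that point.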

Many thanks for your attention in advance!

1 Answer

0 votes
by

This is important for clearing references (potentially to lengthy seq chains), so this is not something that’s likely to change.

I don’t understand how clearing the this reference prevents you from injecting a finally block.

by
Thanks for the quick reply!

Perhaps I can explain with this real example from our experimental instrumentation of Clojure clout.

In particular we are trying to inject code to this route-matches function:
```
(route-matches [_ request]
    (let [path-info (if absolute?
                      (request-url request)
                      (path-info request))]
      (let [groups (re-match-groups re path-info)]
        (when groups
          (assoc-keys-with-groups groups keys)))))
```

The bytecode generated for route-matches is shown below (only the last few instructions, starting from the `(assoc-keys-with-groups groups keys)` call):

        80: getstatic     #136                // Field const__37:Lclojure/lang/Var; #'clout.core/assoc-keys-with-groups
        83: invokevirtual #120                // Method clojure/lang/Var.getRawRoot:()Ljava/lang/Object;
        86: checkcast     #122                // class clojure/lang/IFn
        89: aload_3
        90: aload_0
        91: getfield      #44                 // Field keys:Ljava/lang/Object;
        94: aconst_null
        95: astore_0
        96: invokeinterface #133,  3          // InterfaceMethod clojure/lang/IFn.invoke:(Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;
       101: goto          106
       104: pop
       105: aconst_null
       106: areturn


Now, when we use a bytecode library (javassist) to inject bytecode (a `System.out.println(this.toString())` Java call) at the end of the method body, the result is as below:

        80: getstatic     #136                // Field const__37:Lclojure/lang/Var; #'clout.core/assoc-keys-with-groups
        83: invokevirtual #120                // Method clojure/lang/Var.getRawRoot:()Ljava/lang/Object;
        86: checkcast     #122                // class clojure/lang/IFn
        89: aload_3
        90: aload_0
        91: getfield      #44                 // Field keys:Ljava/lang/Object;
        94: aconst_null
        95: astore_0
        96: invokeinterface #133,  3          // InterfaceMethod clojure/lang/IFn.invoke:(Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;
       101: goto          106
       104: pop
       105: aconst_null
>> start code injection
       106: goto          109
       109: astore        5
       111: getstatic     #489                // Field java/lang/System.out:Ljava/io/PrintStream;
>> inst 114 has problem as aload_0 is not `this` but null
       114: aload_0
       115: invokevirtual #491                // Method toString:()Ljava/lang/String;
       118: invokevirtual #496                // Method java/io/PrintStream.println:(Ljava/lang/String;)V
       121: aload         5
>> finish code injection
       123: areturn

We can see that instruction 114 fails because local variable 0 no longer holds the `this` reference as expected, so `aload_0` loads null.

I am not super familiar with the JVM spec and the clojure bytecode emission as I am pretty new to the language :) But based on https://docs.oracle.com/javase/specs/jvms/se7/html/jvms-2.html#jvms-2.6.1:

> On instance method invocation, local variable 0 is always used to pass a reference to the object on which the instance method is being invoked (this in the Java programming language)

I don't think it forbids setting local variable 0 to null? But I assume it is desirable to always keep local variable 0 as `this` for the call frame?

As in this case, when we use a bytecode injection library, setting local variable 0 to null causes problems when code is injected at the end of the method body.

Injecting the code as a finally block does not make any difference, as javassist basically just injects the same bytecode twice after instruction 105 (with an exception table entry that points to the statement after 105). But `this` is already lost at this point due to the null assignment.
by
There are good reasons to be clearing this and that's covered in the ticket (can prevent an OOME in some situations), so as I said, this is not something Clojure's going to stop doing. One thing that would be possible would be to add a compiler flag (like the others in https://clojure.org/reference/compilation#_compiler_options) to disable this clearing.
by
Thanks. As also discussed in https://stackoverflow.com/a/22175055, it seems nothing prevents bytecode from overwriting local variable 0 with something else within the method body, so I do not think Clojure is doing anything wrong at the moment.

I would say a compiler flag would be nice, as it gives bytecode modification libraries an option to work properly with Clojure in certain cases.

As for us, we can always use a workaround such as storing the `this` reference on function entry if we need access to it on function exit.
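To make that workaround concrete, here is a toy model in plain Java rather than real instrumentation. It assumes a frame can be pictured as an `Object[]` whose slot 0 plays the role of `this`; the class and method names are hypothetical. The body clears slot 0 the way the `aconst_null` / `astore_0` pair does in the listing above, and the two exit hooks show the difference between reading slot 0 late and reading a copy taken at entry.

```java
// Toy model of a method frame: locals[0] stands in for `this`.
// This is an illustration only, not how an agent manipulates real frames.
final class ThisClearingModel {

    // Stand-in for the compiled method body: it reads what it needs from
    // slot 0, then nulls the slot before the final call returns.
    static Object compiledBody(Object[] locals) {
        Object receiver = locals[0];
        locals[0] = null; // mirrors aconst_null / astore_0
        return "result-for-" + receiver;
    }

    // Exit hook that reads slot 0 after the body has run.
    static String naiveExitHook(Object receiver) {
        Object[] locals = {receiver, null};
        compiledBody(locals);
        return String.valueOf(locals[0]); // slot 0 was cleared by the body
    }

    // Workaround: copy slot 0 into a spare slot on entry and read the
    // copy on exit, inside a finally block as an agent typically would.
    static String savedExitHook(Object receiver) {
        Object[] locals = {receiver, null};
        locals[1] = locals[0]; // entry hook: save `this`
        String seenOnExit;
        try {
            compiledBody(locals);
        } finally {
            seenOnExit = String.valueOf(locals[1]); // exit hook: use the copy
        }
        return seenOnExit;
    }
}
```

The cost is that the saved copy keeps the receiver reachable until the method exits, which is the retention the clearing was introduced to avoid, so it is probably best limited to methods where the exit hook really needs `this`.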
...