Troubleshooting the CLR
Development | Nathan Chappell

Friday, Sep 16, 2022 • 31 min read
Investigating a bug in an obfuscation tool provides a good opportunity to take a closer look at dotnet core.

Intro

We need to fix an obfuscator for dotnet core.

Our project is a desktop dotnet core (net5.0) application. The target platform is Windows. Since this is an application we release to our customers, it is also a program that we release to our competitors; therefore we would like to obfuscate the program. We found an open-source obfuscator that, with modification, was suitable for our purposes. However, it needed to be updated to work with dotnet core, and along the way there have been a handful of interesting bugs, the investigation of which is the topic of this post.

The Issues:

We present the issues by providing a contrived program (MRE) which behaves as expected before obfuscation. The obfuscator is tuned to do very little - it prepends an underscore _ to the names of types and methods, removes the namespace information associated with types, and modifies the names of generic parameters. Compare the assemblies in the screenshot:

[Screenshot: Obfuscator Output]

We can see that the class ProblematicCode.GenericIn has been transformed in the following way:

Obfuscation | Class Name                 | Namespace       | GenericParams
Before      | ProblematicCode.GenericIn  | ProblematicCode | InputT
After       | _ProblematicCode.GenericIn | None            | \u0001

The Problematic Code

We have a program which misbehaves after obfuscation. Here are the interfaces and classes:

// ProblematicCode.cs

namespace ProblematicCode;

internal interface IGenericInOut<InputT, OutputT> { OutputT DoSomething(InputT t2); }
internal interface IGenericIn<InputT> : IGenericInOut<InputT, int> { }
internal class GenericIn<InputT> : IGenericIn<InputT> { public int DoSomething(InputT t2) => default; }

// This is a dummy converter.  We don't care that it works, only that GetConverter finds it
internal class MyConverter<T> : System.ComponentModel.TypeConverter { }

There is a generic type (GenericIn) which implements a generic interface (IGenericIn) with a non-generic method (DoSomething). It is contrived, as promised, but the main pattern being described here is the use of “intermediate interfaces” that partially specialize a more generic interface. A less contrived example may be something like:

interface StringKeyDict<T> : IDictionary<string, T> {}

With such a “partial specialization” you restrict implementations to providing a dictionary which accepts strings as keys.

This is the “main” part of the program where the code gets used:

// Program.cs

using System;
using System.ComponentModel;
using ProblematicCode;

// We check for the converter before adding the TypeConverterAttribute
var converterBeforeAttribute = TypeDescriptor.GetConverter(typeof(GenericIn<string>));
Console.WriteLine($"{nameof(converterBeforeAttribute)}: {converterBeforeAttribute.GetType().FullName}");

// After adding this TypeConverterAttribute, TypeDescriptor.GetConverter should find MyConverter
TypeDescriptor.AddAttributes(
    typeof(GenericIn<string>),
    new TypeConverterAttribute(typeof(MyConverter<string>))
);

// We check for the converter after adding the TypeConverterAttribute
var converterAfterAttribute = TypeDescriptor.GetConverter(typeof(GenericIn<string>));
Console.WriteLine($"{nameof(converterAfterAttribute)}: {converterAfterAttribute.GetType().FullName}");

TypeDescriptor.GetConverter is used to retrieve a converter for GenericIn<string>. The interesting point is that there is an attribute added to the type at runtime - the converter is not found before the attribute is added, but should be found afterwards.

  1. The first issue is that after obfuscation the program fails to run and throws a MissingMethodException with the following output:
Unhandled exception. System.MissingMethodException:
    Method not found: 'Int32 _ProblematicCode.IGenericInOut._DoSomething(!0)'.
   at _Program._<Main>$(String[] )
  2. The second issue arises after fixing the first one: TypeDescriptor.GetConverter fails to find a converter for GenericIn<string>, even after adding the attribute.
UNOBFUSCATED

converterBeforeAttribute: System.ComponentModel.TypeConverter
converterAfterAttribute: ProblematicCode.MyConverter`1[[System.String, System.Private.CoreLib, Version=7.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]]
OBFUSCATED

converterBeforeAttribute: System.ComponentModel.TypeConverter
converterAfterAttribute: System.ComponentModel.TypeConverter

Requirements

Make it work.

This is worth dwelling on. Oftentimes when fixing a bug, we can verify that we’ve fixed the problem by determining what section of the code is causing the undesired behavior, modifying the code, and observing that the undesired behavior is gone for a reasonable sample of inputs. We have a somewhat different paradigm. The program we are modifying (the obfuscator) is actually modifying another program, and it is the behavior of the other program that we are interested in. You could argue that there is not really a distinction in principle, but in practice this difference can make life interesting. Note that we have not found any issues with the logical transformations made by the obfuscator. The issues we have found are invalid transformations made to the “metadata” of the .dll of interest (technically, all data in a .dll is metadata, but we are referring specifically to non-code metadata).

The implication is that this is not really a “quick-fix” scenario; it’s more of a “build a base of knowledge about Common Language Infrastructure (CLI) metadata and hope you can figure out why it’s not working” scenario. Without an understanding of what could possibly even be wrong, we probably aren’t going to be able to fix the tool that’s causing the problem. Therefore, the goal of this post is to provide an introduction to the type of knowledge required to fix the issues, rather than to document the fixes themselves.

Solution

To understand the first bug, we will need to do a quick “refresher” of the Common Intermediate Language (CIL, sometimes Microsoft Intermediate Language or MSIL). This gives us an opportunity to give a quick overview of the CLI. We will use standard tools (ilasm/ildasm, ilspy) to examine the metadata, and determine what is going wrong and why.

The second bug becomes a bit more technically challenging. We will be troubleshooting the Common Language Runtime (CLR). Unfortunately, we cannot get away with only inspecting the C# standard libraries: we will also need to flex our C++ muscles and do some troubleshooting of the actual host.

Technical Details

Before We Start…

To follow along with this article you will need to have your own build of the dotnet runtime. Since it takes a non-trivial amount of time to build, you may wish to get this started now, so it will be ready by the time you need it. Here is how you might build it using powershell:

git clone --depth 1 https://github.com/dotnet/runtime
Set-Location .\runtime
.\build.cmd -arch x64 -s clr+libs+host

For more details about building, use runtime\build.cmd -h, or check out runtime\docs\coding-guidelines\project-guidelines.md.

Throughout the post we will refer to files and folders relative to the runtime source code like: runtime\path\to\something.ext.

Prerequisites / References / Tools:

  • dotnet-core: if you don’t have this then this article is probably not interesting to you
  • ilspy: this is the “standard” tool for inspecting assemblies. You can find a copy here
  • ilasm/ildasm: these are “standard” IL assembler/disassembler tools. You have access to them through the Developer Command Prompt for Visual Studio. You can activate the command prompt in powershell by executing (for example):
& "C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\Tools\Launch-VsDevShell.ps1" -Arch amd64 -HostArch amd64
  • ECMA-335: This is the CLI Specification. I would recommend that any C# developer with over one year of experience should at least skim through Partition I. If you choose to read it all the way through, note that the dotnet implementation diverges in a few documented places, and the modern ilasm assemblers may use a slight superset of the syntax and functionality described there.
  • Your own obfuscator. The obfuscator we are using is being developed internally, so you will need to find your own. However, the only functionality of the obfuscator that is used in this article is renaming and removing the namespace associations from the metadata.

Intro to the CLI

ECMA-335 is the specification of the Common Language Infrastructure (CLI). Briefly, this document describes what assembly files are: their contents, their layout, and their semantics - how they are to be interpreted (i.e. executed as a program). From this point on, unless otherwise specified, any reference like I.12 is referring to ECMA-335. I.12 specifically means “partition I, section 12.” Before we continue, there are myriad acronyms that should be clarified.

  • CLI - Common Language Infrastructure A set of specifications and concepts, including VES, CIL, CTS, CLS, Metadata, and some others. For example, dotnet is an implementation of the CLI. All of these specifications are defined in ECMA-335.
  • VES - Virtual Execution System It’s best to just read the definition from I.12:

    The Virtual Execution System (VES) provides an environment for executing managed code. It provides direct support for a set of built-in data types, defines a hypothetical machine with an associated machine model and state, a set of control flow constructs, and an exception handling model. To a large extent, the purpose of the VES is to provide the support required to execute the CIL instruction set (see Partition III).

In other words, this is the abstract machine upon which programs written for the CLI are executed. All instructions specified by the CIL manipulate the state of this machine.

  • CIL - Common Intermediate Language This is an assembler-like language which is defined in ECMA-335 in order to aid in specifying the semantics of Metadata. This is also referred to as IL or MSIL. (One could argue that CIL is precisely the binary encoding described by the specification, and that ilasm should be used to refer to the assembler-like language, but since very few people are likely to be engaged with the actual binary format, we’ll abuse the terminology.)
  • CTS - Common Type System These are the primitive types supported by the CLI. The CTS also describes the initial type hierarchy, and distinguishes between objects and values.
  • CLS - Common Language Specification This is a set of rules that implementations of the CLI should satisfy, whose purpose is “to promote language interoperability.” An example is given below.
  • CLR - Common Language Runtime This consists of the VES and CLS. More practically, this consists of the native (i.e. C++) code which actually executes an assembly, and the internal code necessary to implement the API provided to users of the CLI.

Informal Remarks

If the notion of intermediate language (IL) is new to you, then the following remarks may be helpful.

Consider what happens when you compile a C program. The source code is transformed into an object file. This file contains enough information for your operating system’s runtime-loader to execute your program. However, for PE32+ (Windows) and ELF (Linux), these object files contain machine-code, that is, a sequence of bytes which represent instructions for the processor to execute. Assemblies are similar to object files - they have a prescribed layout which defines their semantics, and they have “instructions” which are to be executed as a program. The main difference is that these “instructions” are not machine-code for a specific processor. Rather, they are machine code for the VES.

So what does this have to do with IL?

Well, the CLI really defines an “abstract computer” of sorts - the VES. This system provides an environment and an instruction set which the system will interpret to manipulate the environment in a precise way - similar to what you would find in a consumer computer. You have the ability to manipulate a stack, fetch and store values from and to memory, and you can call methods available in properly referenced assemblies (including your own). These low-level instructions are called IL. Their organization with metadata in the PE32+ format is called an assembly. One purpose of having such a system is portability: theoretically, an assembly compiled on any machine should run on any other machine for which there is a conforming implementation of the CLI (e.g. your dotnet program built on windows will run with dotnet on linux). Another justification for such a system is separation of concerns: libraries can be updated and maintained independently of OS-specific implementations.
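To make the "abstract computer" idea a little more concrete, here is a minimal sketch that builds a tiny method directly out of IL instructions at runtime and runs it. It uses System.Reflection.Emit.DynamicMethod; the method name "Add" and the variable names are my own choices for illustration and do not come from the project described in this post.

```csharp
using System;
using System.Reflection.Emit;

// Build a method at runtime whose body is four IL instructions.
var add = new DynamicMethod("Add", typeof(int), new[] { typeof(int), typeof(int) });
ILGenerator il = add.GetILGenerator();
il.Emit(OpCodes.Ldarg_0); // push the first argument onto the evaluation stack
il.Emit(OpCodes.Ldarg_1); // push the second argument
il.Emit(OpCodes.Add);     // pop both values and push their sum
il.Emit(OpCodes.Ret);     // return the value on top of the stack

// The runtime compiles the IL to native code when the delegate is created/invoked.
var addDelegate = (Func<int, int, int>)add.CreateDelegate(typeof(Func<int, int, int>));
int sum = addDelegate(2, 3);
Console.WriteLine(sum); // 5
```

Nothing in the emitted body mentions a physical register or a processor instruction: the instructions only push to and pop from the evaluation stack of the VES.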

That takes care of CLI, VES, and IL. There is not much else to say about the CTS; the interested reader should just peruse ECMA-335 I.8. We’ll get more intimate with the CLR later, which just leaves the CLS. The CLS is not particularly relevant to our current task, but it may shed light on some of C#’s idiosyncrasies and conventions. It is a set of rules that implementers of the CLI should follow. Here is an example, which might be interesting to C# programmers, which has to do with naming generic types:

I.10.7.2: CLS-compliant generic type names are encoded using the format “name[`arity]”

CLS Rule 43: The name of a generic type shall encode the number of type parameters declared on the non-nested type, or newly introduced to the type if nested, according to the rules defined above.

[Figure: GenericTypeNameEncoding]

This rule exists to provide “overloads” for generic types (and, less relevantly, richer information for nested types). More specifically, consider the following C# snippet:

class Foo<T> {}
class Foo<T,U> {}

C# allows you to “overload” generic types in such a manner (they have the same name, but which type is used in some instantiation depends on the number of generic arguments provided), but the compiler encodes the names as indicated by the rule.

.class private auto ansi beforefieldinit Foo`1<T> extends [System.Runtime]System.Object
.class private auto ansi beforefieldinit Foo`2<T,U> extends [System.Runtime]System.Object

Actual “overloading” of the type based on number of parameters is not possible. A name identifies a type, and does not include the generic parameters.
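The encoded names are easy to observe from C# itself. The following sketch uses the framework's own "overloaded" delegate types (Action, Action<T>, Action<T1,T2>) rather than the Foo classes above, simply so that the snippet needs no declarations of its own.

```csharp
using System;

// Action, Action<T>, and Action<T1, T2> share a C# name but are three distinct types.
string plain = typeof(Action).Name;
string one = typeof(Action<>).Name;
string two = typeof(Action<,>).Name;

Console.WriteLine($"{plain} {one} {two}"); // Action Action`1 Action`2

// The arity suffix agrees with the number of generic parameters.
int arity = typeof(Action<,>).GetGenericArguments().Length;
Console.WriteLine(arity); // 2
```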

Bug 1

Now, back to the task at hand.

We’ve built our program, and obfuscated the code. Trying to run the program after obfuscation leads to a MissingMethodException being thrown by the runtime. Let’s look at the IL of the obfuscated program (you can tell it has been obfuscated because the type names have a leading underscore _).

This is the interesting output of the command ildasm.exe /text /out=out_file.il path/to/obfuscated.dll:

.class interface private abstract auto ansi _ProblematicCode.IGenericInOut<T,U>
{
  .method public hidebysig newslot abstract virtual
          instance !U  _DoSomething(!T A_1) cil managed
  {
  } // end of method _ProblematicCode.IGenericInOut::_DoSomething
} // end of class _ProblematicCode.IGenericInOut

/* ... */

.class private auto ansi beforefieldinit _ProblematicCode.GenericIn<T>
       extends [System.Runtime]System.Object
       implements class _ProblematicCode.IGenericInOut<!T,int32>,
                  class _ProblematicCode.IGenericIn<!T>
{
  .method public hidebysig newslot virtual final
          instance int32  _DoSomething(!T A_1) cil managed
  {
    .override  method instance int32 class _ProblematicCode.IGenericInOut<!T,int32>::_DoSomething(!0)
    .maxstack  8
    IL_0000:  ldc.i4.0
    IL_0001:  ret
  } // end of method _ProblematicCode.GenericIn::_DoSomething

  /* ... */

} // end of class _ProblematicCode.GenericIn

Let’s take this a chunk at a time.

Definition of the Generic Interface

.class interface private abstract auto ansi _ProblematicCode.IGenericInOut<T,U>
{

So the first thing you will notice about IL is that it is extremely verbose. But, we have a serious task at hand, so let’s do this by the book.

.class

We see in II.5.10 that the production .class ClassHeader { ClassMember* } is one of many declarations at the top level of an ILFile, or ilasm source file. We conclude that interface private abstract auto ansi _ProblematicCode.IGenericInOut<T,U> is the ClassHeader. We turn to II.10. We see that a Type Header (a.k.a. ClassHeader) consists of attributes, a name (Id), an optional base type (default System.Object), and an optional list of implemented interfaces.

The attributes of this class declaration are:

  • interface: Declares an interface. Section II.12 describes the semantics of interfaces. The only thing surprising here to a C# programmer is that CLI interfaces can have static fields and methods. CLS Rule 19, however, states that CLS-compliant interfaces shall not define static methods, nor shall they define fields. So, traditionally (before C# 8 added default interface methods and static interface members), the only thing you could really do with an interface in C# was declare abstract virtual functions in order to define a contract.
  • private: this defines the visibility of the type. I.8.5.3.1 tells us that visibility determines if the type is exported from the assembly (i.e. can be used from another assembly). A private type may often be called an internal type, and corresponds to this C# concept.
  • abstract: (II.10.1.4) specifies that this type shall not be instantiated. It also says that classes with abstract methods shall be declared abstract.
  • auto: this specifies the layout of the type (II.10.1.2). Since this is an interface, this notion is irrelevant, and auto is the reasonable default value (it’s explicit in this code because it was written by a disassembler).
  • ansi: specifies that marshalling shall be to and from ANSI strings (II.10.1.5). Another sensible default, similar to the above.
  • _ProblematicCode.IGenericInOut: This is the Id. Notice that there is no `2 at the end of the name - implying that our obfuscator does not generate CLS-compliant type names. This is not an issue for us, but noteworthy. More commentary on Ids below.
  • <T,U>: these are the generic parameters of the type.

Id

The CLI/CLS rules concerning identifiers are given in I.8.5, but they are not very enlightening. Basically an almost arbitrary sequence of bytes can be a valid name, but CLS Rule 4 restricts this to what’s recommended by Unicode Normalization Forms: Programming Language Identifiers. The format given there is fairly unsurprising: an identifier cannot start with a digit or underscore. Remember that the CLS rules are mainly meant to apply to a library or the CLI API.

In my opinion, more interesting and useful for our purposes is how identifiers are specified according to the ilasm grammar. II.5.2/II.5.3 describes an ID as what is typically expected (plus a few more characters): ([A-Za-z_$@`?][A-Za-z0-9_$@`?]*). Then it describes Id as either an ID or an SQSTRING (single-quoted string). Many times, compiler generated names of types must be written as an SQSTRING. We can see an example of this in the default console app for net6.0 (which uses Top-Level Statements). When I run:

  dotnet new console;
  dotnet build;
  ildasm.exe /text .\bin\Debug\net6.0\<Program Name>.dll;

I can see the following declaration:

.class private auto ansi beforefieldinit Program
       extends [System.Runtime]System.Object
{
  .custom instance void [System.Runtime]System.Runtime.CompilerServices.CompilerGeneratedAttribute::.ctor() = ( 01 00 00 00 )
  .method private hidebysig static void  '<Main>$'(string[] args) cil managed
  {
    .entrypoint

Here we can see that the compiler generated a class named Program. We know it was generated because we didn’t declare it, but the CompilerGeneratedAttribute gives us a hint (we don’t know the precise semantics of this attribute, but it seems likely it means what we think it means). More interesting is the method inside the generated type, '<Main>$'. Note that this method is somewhat aptly named, since the .entrypoint directive tells us that this is the entrypoint to the program (in C#, this directive is added by convention to the unique static method named Main with an appropriate signature). Note that the name is required to be an SQSTRING due to the presence of < and >.
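As a rough illustration of the ID versus SQSTRING distinction, the sketch below checks a name against the ID character classes quoted from II.5.2 and single-quotes it otherwise. ToIlasmId is a hypothetical helper of mine, not something from ilasm or ildasm, and it deliberately ignores two things a real disassembler has to handle: names that collide with ilasm keywords, and names that themselves contain a single quote.

```csharp
using System;
using System.Text.RegularExpressions;

// Character classes taken from the description of ID in II.5.2.
var bareId = new Regex(@"^[A-Za-z_$@`?][A-Za-z0-9_$@`?]*\z");

// Hypothetical helper: quote a name only when it is not a bare ID.
string ToIlasmId(string name) => bareId.IsMatch(name) ? name : $"'{name}'";

Console.WriteLine(ToIlasmId("_DoSomething")); // _DoSomething
Console.WriteLine(ToIlasmId("Foo`1"));        // Foo`1
Console.WriteLine(ToIlasmId("<Main>$"));      // '<Main>$'
```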

Digging further into the ilasm identifier syntax, in the runtime source code under runtime\src\coreclr\ilasm\asmparse.y, we see their grammar for classHead:

dottedName              : id                                  { $$ = $1; }
    | DOTTEDNAME                          { $$ = $1; }
    | dottedName '.' dottedName           { $$ = newStringWDel($1, '.', $3); }
                        ;

classHead               : classHeadBegin extendsClause implClause               { PASM->AddClass(); }
classHeadBegin          : _class classAttr dottedName typarsClause              { if($4) FixupConstraints();

So it seems that ilasm allows a dottedName in the ClassHeader (that is, any sequence of Ids joined with '.'). See my earlier remark about ilasm not being strictly what is described in ECMA-335. The fact that dottedName appears in the ClassHeader actually has some interesting implications, one of which we will see later.

.method

The next interesting declaration in our type definition is:

  .method public hidebysig newslot abstract virtual instance !U  _DoSomething(!T A_1) cil managed

II.15 describes the production for defining methods: .method MethodHeader { MethodBodyItem* }, and in II.15.4 the production for MethodHeader: MethAttr* [CallConv] Type [marshal ([NativeType]) ] MethodName [ < GenPars > ] ( Parameters ) ImplAttr*. The definition of CallConv is given in II.15.3; the point is that instance is a CallConv and indicates that a this pointer will be passed to the method. So, our attributes are:

  • public: this describes the visibility of the method. I.8.5.3.2 says that public implies accessible to all referents. In other words, it means what you think it means.
  • hidebysig: (II.15.4.2.2) is used by tools and not by the VES. Described in more detail below.
  • newslot: (II.10.3.1) If the definition of a virtual method is marked newslot, then it creates a new virtual method and does not override a base class method.
  • abstract: (II.15.4.2.4) abstract shall only be used with virtual methods that are not final. It specifies that an implementation of the method is not provided but shall be provided by a derived class. abstract methods shall only appear in abstract types.
  • virtual: this means what you think it means, but for reference here is what’s written in II.15.2:

Virtual methods are associated with an instance of a type in much the same way as for instance methods. However, unlike instance methods, it is possible to call a virtual method in such a way that the implementation of the method shall be chosen at runtime by the VES depending upon the type of object used for the this pointer. The particular Method that implements a virtual method is determined dynamically at runtime (a virtual call) when invoked via the callvirt instruction; whilst the binding is decided at compile time when invoked via the call instruction (see Partition III).
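The attributes in the list above are also visible through reflection, which can be a quick sanity check when you don't have ildasm at hand. This sketch inspects IDisposable.Dispose, chosen only because it is an ordinary interface method that every runtime ships; I would expect any interface method compiled from C# to report the same set of flags as our _DoSomething.

```csharp
using System;
using System.Reflection;

// IDisposable.Dispose is an ordinary interface method, like DoSomething above.
MethodInfo dispose = typeof(IDisposable).GetMethod("Dispose")!;

// newslot has no convenience property, so test the flag directly.
bool newSlot = (dispose.Attributes & MethodAttributes.NewSlot) != 0;

Console.WriteLine($"public={dispose.IsPublic}");
Console.WriteLine($"hidebysig={dispose.IsHideBySig}");
Console.WriteLine($"newslot={newSlot}");
Console.WriteLine($"abstract={dispose.IsAbstract}");
Console.WriteLine($"virtual={dispose.IsVirtual}");
```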

hidebysig

We’ll take a moment to look at the definition of hidebysig, as it highlights an interesting difference between C# and C++.

hidebysig is supplied for the use of tools and is ignored by the VES. It specifies that the declared method hides all methods of the base class types that have a matching method signature; when omitted, the method should hide all methods of the same name, regardless of the signature.

[Rationale: Some languages (such as C++) use a hide-by-name semantics while others (such as C#, Java™) use a hide-by-name-and-signature semantics. end rationale]

To demonstrate the difference, consider the following C# program:

var derived = new Derived();
derived.F(1);
derived.F();

class Base
{
    public virtual void F()      { Console.WriteLine($"Base.F()");    }
    public virtual void F(int _) { Console.WriteLine($"Base.F(int)"); }
}

class Derived: Base
{
    public override void F()     { Console.WriteLine($"Derived.F()"); }
}

The output of this program is:

Base.F(int)
Derived.F()

Consider a similar program in C++:

#include <iostream>

class FooBase {
public:
    virtual void DoStuff() { std::cout << "FooBase.DoStuff()" << std::endl; }
};

class Foo: FooBase {
public:
    virtual void DoStuff(const std::string &s) { FooBase::DoStuff(); std::cout << "Foo.DoStuff(" << s << ")" << std::endl; }
};

int main(int argc, char **argv) {
    Foo().DoStuff();
}

When I compile this with cl /EHsc /c .\HideByName.cpp, I get the following compiler error:

.\HideByName.cpp(14): error C2660: 'Foo::DoStuff': function does not take 0 arguments

So what went wrong? The rules of C++ name-lookup dictate that when a name is found in a scope, name lookup goes no further, and only the functions with the same name in that scope will be considered for overload resolution. Take a look at the following quote and visit C++ Reference for more information.

A function with the same name but different parameter list does not override the base function of the same name, but hides it: when unqualified name lookup examines the scope of the derived class, the lookup finds the declaration and does not examine the base class.

C# has a similar stipulation to control name lookup:

ECMA 334: 12.6.4.1 Overload Resolution: … the set of candidates for a method invocation does not include methods marked override (§12.5), and methods in a base class are not candidates if any method in a derived class is applicable (§12.7.6.2).

This has an interesting implication. Consider the following code:

var derived = new Derived();
derived.F(1.0);
derived.F(1);

class Base
{
    public void F(int _)    { Console.WriteLine($"Base.F(int)"); }
    public void F(double _) { Console.WriteLine($"Base.F(double)"); }
}

class Derived: Base
{
    public void F(Foo _)    { Console.WriteLine($"Derived.F(Foo)"); }
}

class Foo
{
    public static implicit operator Foo(double _) => default;
}

With output:

Derived.F(Foo)
Derived.F(Foo)

This output could be surprising, given that Base has overloads of F which require no user-defined conversion, but still the compiler chose the “worse” method in the derived class. That’s because it found an applicable method in the derived class.

You may be asking yourself “who cares?” Not you, if you’re never going to use a language other than C#, but it is always useful to be aware of such subtleties. Also not the runtime, apparently, since the attribute is ignored by the VES. It is there for tools (IDEs, compilers) which need to be able to handle multiple languages.

A final comment: in C++, name-lookup and overload-resolution are more cleanly separated, but that is no longer the case in C# - the names available depend on the semantics of the names themselves. (Technically, some amount of semantic information is taken into account when doing name lookup in C++, but it is quite minimal in comparison - read about dependent names for more information.)

Signatures and Generics

Recall the method declaration under inspection:

  .method public hidebysig newslot abstract virtual instance !U  _DoSomething(!T A_1) cil managed

The attribute cil means that the method body consists of cil code, as opposed to native or runtime (see II.15.4.3.1 for more details). II.15.4.3.2 states that cil methods are managed (i.e. executed by the VES, as opposed to code which operates outside the virtual machine). _DoSomething is obviously the MethodName, which means that !U is the (return) Type, and !T A_1 is the Parameters. The meaning of these should be obvious to any programmer, but the syntax may take you aback. II.9 discusses generics in some detail, and you can see the syntax used to reference generic parameters of the enclosing type. In general, in the CIL Metadata you must refer to generic parameters by an index into an appropriate table. In ilasm, we refer to generic parameters with ! or !!: the single exclamation mark ! indicates that we are referring to a generic type parameter, while the double !! refers to a generic method parameter (note that a non-generic method may refer to generic parameters with one !, a generic method in a non-generic class may refer to generic parameters with !!, a generic method in a generic class could use either ! or !!, and a non-generic method in a non-generic class won’t be able to use either (unless nested…)).

In the ilasm grammar the following productions exist:

type :
    | '!' '!' int32
    | '!' int32
    | '!' '!' dottedName
    | '!' dottedName

So we can refer to generic parameters by either index or dottedName (a convenience offered by ilasm not described in ECMA-335), which explains why we see !T and !U instead of !0 and !1 (we would see the actual parameter names had the obfuscator not destroyed them).
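The same indices are exposed by reflection, so we can see the ! and !! forms without a disassembler. In this sketch I use Func<T,TResult>.Invoke as a stand-in for our interface method, because it has the same shape (it takes the first type parameter and returns the second), and Array.Empty<T>() as an example of a method-level parameter. Both stand-ins are my choice; the names printed are those of the framework types, not of our obfuscated code.

```csharp
using System;
using System.Reflection;

// Func<T, TResult>.Invoke has the same shape as IGenericInOut.DoSomething:
// it takes the first type parameter and returns the second.
MethodInfo invoke = typeof(Func<,>).GetMethod("Invoke")!;
Type returnType = invoke.ReturnType;
Type paramType = invoke.GetParameters()[0].ParameterType;

// Parameters of the enclosing type: ilasm writes these as !1 and !0.
Console.WriteLine($"!{returnType.GenericParameterPosition} ({returnType.Name})"); // !1 (TResult)
Console.WriteLine($"!{paramType.GenericParameterPosition} ({paramType.Name})");   // !0 (T)

// Array.Empty<T>() declares its own parameter: ilasm writes this one as !!0.
Type methodParam = typeof(Array).GetMethod("Empty")!.GetGenericArguments()[0];
Console.WriteLine($"!!{methodParam.GenericParameterPosition} ({methodParam.Name})"); // !!0 (T)
Console.WriteLine(methodParam.IsGenericMethodParameter); // True
```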

This concludes our discussion of the following declarations:

.class interface private abstract auto ansi _ProblematicCode.IGenericInOut<T,U>
{
  .method public hidebysig newslot abstract virtual instance !U _DoSomething(!T A_1) cil managed
  {
  } // end of method _ProblematicCode.IGenericInOut::_DoSomething
} // end of class _ProblematicCode.IGenericInOut

Troubleshooting the IL

Now we turn to the ilasm for our implementation of the interface, and attempt to identify what went wrong.

.class private auto ansi beforefieldinit _ProblematicCode.GenericIn<T>
       extends [System.Runtime]System.Object
       implements class _ProblematicCode.IGenericInOut<!T,int32>,
                  class _ProblematicCode.IGenericIn<!T>
{
  .method public hidebysig newslot virtual final
          instance int32  _DoSomething(!T A_1) cil managed
  {
    .override  method instance int32 class _ProblematicCode.IGenericInOut<!T,int32>::_DoSomething(!0)
    .maxstack  8
    IL_0000:  ldc.i4.0
    IL_0001:  ret
  } // end of method _ProblematicCode.GenericIn::_DoSomething

  /* ... */

} // end of class _ProblematicCode.GenericIn

There’s only one thing new to us here:

  • .override: II.15.4.1 gives the production .override TypeSpec :: MethodName, states use current method as the implementation for the method specified, and indicates we should look at II.10.3.2 for more information. That section states (emphasis mine):

The .override directive specifies that a virtual method shall be implemented (overridden), in this type, by a virtual method with a different name, BUT WITH THE SAME SIGNATURE. This directive can be used to provide an implementation for a virtual method inherited from a base class, or a virtual method specified in an interface implemented by this type.
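C# programmers meet this mechanism as explicit interface implementation. The sketch below asks the runtime which method of List<int> implements IEnumerable<int>.GetEnumerator; List<T> is just a convenient framework type that implements this interface method explicitly. The implementing method has a different name from the interface method, but the same signature, which is the situation the quote describes.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Ask the runtime how List<int> implements IEnumerable<int>.
InterfaceMapping map = typeof(List<int>).GetInterfaceMap(typeof(IEnumerable<int>));
MethodInfo declared = map.InterfaceMethods[0];
MethodInfo implementation = map.TargetMethods[0];

Console.WriteLine(declared.Name);        // GetEnumerator
Console.WriteLine(implementation.Name);  // a longer, interface-qualified name
Console.WriteLine(declared.ReturnType == implementation.ReturnType); // True
```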

Now we know everything we need to solve this issue.

Solution to Bug 1

Let’s take a look at the interface method declaration and the .override directive.

  .override method instance int32 class _ProblematicCode.IGenericInOut<!T,int32>::_DoSomething(!0)
  .method public hidebysig newslot abstract virtual instance !U _DoSomething(!T A_1) cil managed

Do you see the issue? The .override directive specifies that the “current” method should override a method as described in the following table.

Method Name  | Parent Type                              | Return Type | Parameters
_DoSomething | _ProblematicCode.IGenericInOut<!T,int32> | int32       | (!0)

Of course, this method does not exist! The method actually declared in _ProblematicCode.IGenericInOut<T,U> is described as follows:

Method Name  | Parent Type                           | Return Type | Parameters
_DoSomething | _ProblematicCode.IGenericInOut<!T,!U> | !U          | (!T)

If we recall the remarks about how generic parameters are “really” referred to by their index in metadata, then we know that the above method is the same as:

Method Name  | Parent Type                           | Return Type | Parameters
_DoSomething | _ProblematicCode.IGenericInOut<!T,!U> | !1          | (!0)
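The comparison made in the tables above can be imitated with a few lines of reflection. This is only a sketch of the idea, not how the runtime resolves method references: Render, Signature, and Exists are hypothetical helpers, the string format is my own simplification of ilasm notation, and Func<T,TResult>.Invoke again stands in for _DoSomething. The assumption being illustrated is that the reference has to match the signature as written on the open generic type.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Hypothetical helper: render a type roughly the way ilasm would inside a signature.
string Render(Type t) => t.IsGenericTypeParameter ? $"!{t.GenericParameterPosition}" : t.Name;

// Hypothetical helper: "ReturnType Name(Params)" for a method of an open generic type.
string Signature(MethodInfo m) =>
    $"{Render(m.ReturnType)} {m.Name}({string.Join(",", m.GetParameters().Select(p => Render(p.ParameterType)))})";

// Hypothetical helper: look a method up by its uninstantiated signature.
bool Exists(Type openType, string signature) =>
    openType.GetMethods().Any(m => Signature(m) == signature);

Console.WriteLine(Exists(typeof(Func<,>), "Int32 Invoke(!0)")); // False: the shape the obfuscator wrote
Console.WriteLine(Exists(typeof(Func<,>), "!1 Invoke(!0)"));    // True: the shape that is declared
```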

We conclude that our obfuscator has erroneously replaced !1 with int32 in the return type of the method signature. This seems like a “reasonable” mistake to make. We can test our theory by making the following change to the il:

    - .override  method instance int32 class _ProblematicCode.IGenericInOut<!T,int32>::_DoSomething(!0)
    + .override  method instance !1 class _ProblematicCode.IGenericInOut<!T,int32>::_DoSomething(!0)
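To make the mismatch concrete, here is a toy model in C++ of how such a signature comparison might go. It is only a sketch: SigType, MethodSig and sameSignature are names invented for illustration, and the real runtime compares metadata signature blobs rather than structs like these. The point is just that int32 and !1 are different things as long as nothing substitutes the generic arguments.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of one type inside a method signature: either a named
// type such as int32, or a generic parameter referenced by index (!0, !1).
struct SigType {
    bool isGenericVar;
    int index;         // used only when isGenericVar is true
    std::string name;  // used only when isGenericVar is false

    static SigType Var(int i) { return SigType{true, i, ""}; }
    static SigType Named(std::string n) { return SigType{false, -1, std::move(n)}; }
};

bool operator==(const SigType& a, const SigType& b) {
    if (a.isGenericVar != b.isGenericVar) return false;
    return a.isGenericVar ? a.index == b.index : a.name == b.name;
}

struct MethodSig {
    std::string name;
    SigType returnType;
    std::vector<SigType> params;
};

// Signatures are compared as written, with no generic substitution applied.
bool sameSignature(const MethodSig& a, const MethodSig& b) {
    return a.name == b.name && a.returnType == b.returnType && a.params == b.params;
}

void runOverrideExample() {
    // What the interface declares: !1 _DoSomething(!0)
    MethodSig declared{"_DoSomething", SigType::Var(1), {SigType::Var(0)}};
    // What the broken .override asks for: int32 _DoSomething(!0)
    MethodSig broken{"_DoSomething", SigType::Named("int32"), {SigType::Var(0)}};
    // What the patched .override asks for: !1 _DoSomething(!0)
    MethodSig patched{"_DoSomething", SigType::Var(1), {SigType::Var(0)}};

    assert(!sameSignature(declared, broken));
    assert(sameSignature(declared, patched));
}
```

In this model the broken directive points at a method that simply is not there, which matches what we saw from the runtime.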

Recompile with ilasm.exe /dll /output=path/to/obfuscated.dll out_file.il, and it should work. I get the following output:

converterBeforeAttribute: System.ComponentModel.TypeConverter
converterAfterAttribute: _ProblematicCode.MyConverter[[System.String, System.Private.CoreLib, Version=7.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]]

Which is exactly what we were expecting. Let’s go fix our obfuscator, rebuild our project, and see what happens.

Spongebob

Now here’s the output I get:

converterBeforeAttribute: System.ComponentModel.TypeConverter
converterAfterAttribute: System.ComponentModel.TypeConverter

So, fixing the obfuscator made it so our program can in fact run. But now TypeDescriptor.GetConverter is not finding our converter! What’s even stranger, the obfuscated program worked correctly after building it with ilasm. This strange behavior is our next issue.

Bug 2

At this point we want to know why TypeDescriptor.GetConverter is not able to find our converter. An obvious thing to do is to look at our assembly using ilspy and make sure that the type is actually there.

IlSpy

Okay, so the type is there. At this point, my best idea is to try to debug the internals of TypeDescriptor.GetConverter. In this section we are going to look at three “levels” of debugging the CLR:

  1. Debugging the production libs
  2. Debugging our own built libs
  3. Rebuilding System.Private.CoreLib and modifying the native code

Each level is more sophisticated, difficult, and time consuming than the last, but level 3 is about as deep as you can go, and if you go this far you will ultimately be able to answer any question you have about the behavior of dotnet.

Level 1.

This is the most straightforward and easy way to try to get some insight into the internals of the CLR, but it is also the least likely to be successful.

I add the following line to the code:

  + System.Diagnostics.Debugger.Launch();
    // We check for the converter after adding the TypeConverterAttribute
    var converterAfterAttribute = TypeDescriptor.GetConverter(typeof(GenericIn<string>));

After adding this line, when I rebuild and run the project I’ll get a popup asking me if I want to open up Visual Studio to debug the program. I say yes, and the first thing I do is change the Debug settings. I need to disable Just My Code.

JustMyCode

After doing this you may be prompted to download symbols from a symbol server; go ahead and do it. You can play around at this level and see what works and what doesn’t. You’ll notice that you can’t step into every method, and sometimes stepping sends you to a place you weren’t expecting. That is perhaps because you are stepping through optimized code, which is not suitable for debugging. For example, the compiler may have emitted instructions that don’t correspond directly to the source code, or may not have left suitable “sequence points” for the debugger to truly 1-step the instructions. The last time I tried, I couldn’t even step into TypeDescriptor.GetConverter, although I did have some luck with some other reflection routines.

Level 2.

Now we will modify and use the libs that we’ve built. We will not modify System.Private.CoreLib, at least not yet, but everything else in the System namespace is fair game. What is a reasonable workflow for modifying and using the libs we’ve built? Here is what worked for me:

  1. In the project, update the .csproj file to have the lines:
    <RuntimeIdentifier>win-x64</RuntimeIdentifier>
    <TargetFramework>net7.0</TargetFramework>
  2. Add the newly built dotnet cli located at runtime/.dotnet/dotnet.exe to my PATH, or make it otherwise accessible. For the rest of these instructions, this newly built dotnet cli will be referred to as dotnet'. This is sometimes called the dogfooding version of dotnet.
  3. Make any modifications to the standard libraries I desire. These are all the subprojects in runtime\src\libraries.
  4. Use dotnet' to rebuild the solution. For example, if my current working directory was runtime\src\libraries\System.Console, I could run ..\..\..\.dotnet\dotnet.exe build .\src\System.Console.csproj /p:BuildTargetFramework=net7.0-windows to rebuild that library without having to rebuild the entire standard library.
  5. Use dotnet' to build MY project (I mean, the code I want to use to test the new libraries). This works just like a normal dotnet cli. Note that at the time you’re doing this, net7.0 may or may not be appropriate to target the latest libraries you’ve built. You may have to determine suitable changes to make it work correctly.
  6. Copy all the built libraries into the .\bin\Debug\net7.0\win-x64 build directory of my project. This is not quite trivial; I describe how you might do it later.
  7. Debug my project. I should be able to step through the standard libraries as if it were my very own code.
  8. Goto (3)

Comments on Level 2.

I recommend you take a look at runtime\docs\workflow\testing\using-your-build.md. At any rate, be prepared for things not to go exactly as you expect, and pay attention to build output logs - they may be giving you the exact information you need to accomplish what you want (I’ve found this to be the case, at least). I also found it very helpful to have a small powershell function to automate (5) and (6). Here is a snippet that demonstrates the main idea:

$RUNTIME_DIR = "C:\Users\natha\programming\cs\runtime"
$DOTNET_CLI = "$RUNTIME_DIR\.dotnet\dotnet.exe"

$PROJECT_BUILD_DIR = ".\src\ProblematicCode\bin\Debug\net7.0\win-x64"
$PROJECT_CSPROJ = "src\ProblematicCode\ProblematicCode.csproj"

$artifactDirs = gci $RUNTIME_DIR\artifacts\bin\System.* | where {
    -not ($_.Name -match "\.Tests")
} | %{
    $artifactDir = gci -Recurse -Directory "net7.0-windows" $_
    if (-not $artifactDir) {
        $artifactDir = gci -Recurse -Directory "net7.0" $_
    }
    $artifactDir
}

$artifacts = $artifactDirs | % {
    gci -Recurse $_ 'System.*.dll';
    gci -Recurse $_ 'System.*.pdb';
} | where {
        (-not ($_.FullName -match "\\ref\\") -and
        -not ($_.FullName -match "Test")) -or
        ($_.FullName -match "Private.CoreLib")
}

function ReplaceLibs {
    Copy-Item -Force ($artifacts | % FullName) $PROJECT_BUILD_DIR
}

function MyBuild {
    & $DOTNET_CLI clean $PROJECT_CSPROJ
    & $DOTNET_CLI build $PROJECT_CSPROJ
    ReplaceLibs
}

Going into the powershell script above is a little out of scope, but the basic idea is to comb through the System.* directories in runtime\artifacts\bin, where much of the output of our build process ends up. In those directories, we look for .dll and .pdb files, skipping reference assemblies (anything under a ref directory) and test output, but always keeping anything that matches Private.CoreLib. This should grab the standard library that we’ve just built.

Here are a few screenshots of the debug experience using the built libs:

Debug_GetConverter_1 Debug_GetConverter_2 Debug_GetConverter_3

Using the debugger I was able to arrive at the following line from runtime\src\libraries\System.ComponentModel.TypeConverter\src\System\ComponentModel\ReflectTypeDescriptionProvider.ReflectedTypeData.cs:

                        Type? converterType = GetTypeFromName(typeAttr.ConverterTypeName);

At this point, I’m highly suspicious, because I know that the obfuscator has messed with the names of the types somehow, so perhaps this is where I should be looking. In the definition of that function we find:

            private Type? GetTypeFromName(
                // this method doesn't create the type, but all callers are annotated with PublicConstructors,
                // so use that value to ensure the Type will be preserved
                [DynamicallyAccessedMembers(DynamicallyAccessedMemberTypes.PublicConstructors)] string typeName)
            {
                if (string.IsNullOrEmpty(typeName))
                {
                    return null;
                }

                int commaIndex = typeName.IndexOf(',');
                Type? t = null;

                if (commaIndex == -1)
                {
                    t = _type.Assembly.GetType(typeName);
                }

                t ??= Type.GetType(typeName);
                /* ... */
            }

For some peculiar reason, I think that t ??= Type.GetType(typeName); is worth investigating. Now’s a good time to try out some printf style debugging. Within the standard libraries, this is supported, but you will probably need to add a reference to System.Console. For example, from runtime\src\libraries\System.ComponentModel.TypeConverter I run:

..\..\..\.dotnet\dotnet.exe add .\src\System.ComponentModel.TypeConverter.csproj reference ..\System.Console\src\System.Console.csproj

To demonstrate, we will make the following changes to the above code:

                System.Console.WriteLine(new string('*', 40));
                System.Console.WriteLine($"{nameof(t)} before {t?.FullName ?? "<NULL>"}");
                System.Console.WriteLine($"Type.GetType({typeName})");
                t ??= Type.GetType(typeName);
                System.Console.WriteLine($"{nameof(t)} after: {t?.FullName ?? "<NULL>"}");
                System.Console.WriteLine(new string('*', 40));

We run the unobfuscated and obfuscated versions, and get the following output:

UNOBFUSCATED

converterBeforeAttribute: System.ComponentModel.TypeConverter
****************************************
t before <NULL>
Type.GetType(ProblematicCode.MyConverter`1[[System.String, System.Private.CoreLib, Version=7.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]], ProblematicCode, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null)
t after: ProblematicCode.MyConverter`1[[System.String, System.Private.CoreLib, Version=7.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]]
****************************************
converterAfterAttribute: ProblematicCode.MyConverter`1[[System.String, System.Private.CoreLib, Version=7.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]]
OBFUSCATED

converterBeforeAttribute: System.ComponentModel.TypeConverter
****************************************
t before <NULL>
Type.GetType(_ProblematicCode.MyConverter[[System.String, System.Private.CoreLib, Version=7.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]], ProblematicCode, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null)
t after: <NULL>
****************************************
converterAfterAttribute: System.ComponentModel.TypeConverter

So, in the unobfuscated code, the call to Type.GetType succeeds, in the obfuscated code it does not. We’ve used our debugging techniques to determine the point in the program that the behavior of the obfuscated program and the unobfuscated program diverge. Now we just have to dig into Type.GetType, which is a whole ‘nother thing.

Level 3.

So, now we want to play around with Type.GetType. Simple enough, we apply our above workflow to System.Reflection, right? Well, if you go to the System.Reflection source, you’ll find that there isn’t much there. That’s because the project is “really” a part of System.Private.CoreLib. At this point I must recommend that you read the first few sections of runtime\docs\design\coreclr\botr\corelib.md. This is a chapter from The Book of the Runtime (BOTR). This “book” has a wealth of information, mostly informal, from .NET developers gathered over years of working on the project. Needless to say, it is required reading for someone looking to learn more about the runtime. The main point is that the System.Private.CoreLib project is special because it is coupled to the native implementation, and gets special treatment from optimizers during build. The location of our System.Private.CoreLib project is runtime\src\coreclr\System.Private.CoreLib. We can build this project similarly as before, but using our change is not as straightforward due to the coupling with native code. So we need to rebuild the CLR. The documentation at runtime\docs\workflow\testing\using-your-build.md gives a hint:

You can build just the .NET Library part of the build by doing (debug, for release add ‘release’ qualifier) (on Linux / OSX us ./build.sh)

    .\build skiptests skipnative

Note that .\build above is runtime\src\coreclr\build-runtime.cmd in my build - keep your eyes open while working on the project and referring to potentially dated (but still quite useful) documentation. So, to recap, in order to rebuild the CLR, we need to:

  1. runtime\.dotnet\dotnet.exe build runtime\src\coreclr\System.Private.CoreLib\System.Private.CoreLib.csproj
  2. runtime\src\coreclr\build-runtime.cmd [skipnative]

I indicated that [skipnative] is optional, because whether you skip it or not depends on if you’ve changed any native code. Once again, I find a powershell helper-function useful for me:

$RUNTIME_DIR = "path\to\runtime"
$DOTNET_CLI = "$RUNTIME_DIR\.dotnet\dotnet.exe"
$BUILD_RUNTIME = "$RUNTIME_DIR\src\coreclr\build-runtime.cmd"
$PRIVATE_CORELIB_CSPROJ = "$RUNTIME_DIR\src\coreclr\System.Private.CoreLib\System.Private.CoreLib.csproj"

function RebuildCLR {
    [CmdletBinding()]
    param (
        [Parameter()]
        [Switch]
        $SkipNative,

        [Parameter()]
        [Switch]
        $SkipManaged
    )
    if (-not $SkipManaged) {
        & $DOTNET_CLI build $PRIVATE_CORELIB_CSPROJ
    }

    if ($SkipNative) {
        & $BUILD_RUNTIME -skipnative
    }
    else {
        & $BUILD_RUNTIME
    }
    Copy-Item -Force $RUNTIME_DIR\artifacts\bin\coreclr\windows.x64.Debug\IL\System.Private.CoreLib.* `
                     $RUNTIME_DIR\artifacts\bin\coreclr\windows.x64.Debug\
}

If you’re curious about the last Copy-Item, it’s because when I looked at the output of & $DOTNET_CLI build $PRIVATE_CORELIB_CSPROJ, that’s where the output went:

PS > & $DOTNET_CLI build $PRIVATE_CORELIB_CSPROJ
Microsoft (R) Build Engine version 17.2.0-preview-22175-02+058a0262c for .NET
Copyright (C) Microsoft Corporation. All rights reserved.

  Determining projects to restore...
  All projects are up-to-date for restore.
  ...
  System.Private.CoreLib -> D:\runtime\artifacts\bin\coreclr\windows.x64.Debug\IL\System.Private.CoreLib.dll

Keep your eyes open, be prepared to do some hacking.

Back to the problem at hand, we look in runtime\src\coreclr\System.Private.CoreLib\src\System\Type.CoreCLR.cs, and find that Type.GetType(string) just forwards the call to RuntimeType.GetType, which then calls RuntimeTypeHandle.GetTypeByName. We search for the definition of this method in runtime\src\coreclr\System.Private.CoreLib\src\System\RuntimeHandles.cs, and we find this:

[LibraryImport(RuntimeHelpers.QCall, EntryPoint = "RuntimeTypeHandle_GetTypeByName", StringMarshalling = StringMarshalling.Utf16)]
        private static partial void GetTypeByName(string name, [MarshalAs(UnmanagedType.Bool)] bool throwOnError, [MarshalAs(UnmanagedType.Bool)] bool ignoreCase, StackCrawlMarkHandle stackMark,
            ObjectHandleOnStack assemblyLoadContext,
            ObjectHandleOnStack type, ObjectHandleOnStack keepalive);

Looks scary. In the BOTR you will find plenty of documentation on QCall, for us it is sufficient to know that this calls a C++ function, whose definition we will have to track down to dig any further. At least we can guess its name (right?): EntryPoint = "RuntimeTypeHandle_GetTypeByName". We’ll find the definition of a function with that name in runtime\src\coreclr\vm\runtimehandles.cpp. At this point digging through the code to figure out what information you want to extract, and how to extract it, will mostly be a function of your C++ skills. Since I know how the story ends, we’ll skip right to runtime\src\coreclr\vm\classhash.cpp, and look at the function (comments mine):

EEClassHashEntry_t * EEClassHashTable::GetValue(LPCUTF8 pszFullyQualifiedName, PTR_VOID *pData, BOOL IsNested, LookupContext *pContext)
{
    /*
        Macros like these define pre-conditions and assumptions, and indicate the
        various state changes the function can make.
        See `runtime\docs\coding-guidelines\clr-code-guide.md` for more information
    */
    CONTRACTL
    {
        if (m_bCaseInsensitive) THROWS; else NOTHROW;
        if (m_bCaseInsensitive) GC_TRIGGERS; else GC_NOTRIGGER;
        if (m_bCaseInsensitive) INJECT_FAULT(COMPlusThrowOM();); else FORBID_FAULT;
        MODE_ANY;
        SUPPORTS_DAC;
    }
    CONTRACTL_END;

    _ASSERTE(m_pModule != NULL);

    /*
        null-terminated string (sz) in class optimized for 512 or less bytes of data
        SEE: runtime\src\coreclr\inc\corhlprpriv.h
    */
    CQuickBytes szNamespace;

    // Essentially a `const char *`
    LPCUTF8 pNamespace = Utf8Empty;

    // returns pointer to last `.` in the name (or NULL)
    LPCUTF8 p = ns::FindSep(pszFullyQualifiedName);

    if (p != NULL)
    {
        SIZE_T d = p - pszFullyQualifiedName;

        FAULT_NOT_FATAL();
        /*
            Modifies the buffer in szNamespace to contain the first `d`
            bytes of pszFullyQualifiedName, terminates with NULL (0),
            then returns pointer to buffer
        */
        pNamespace = szNamespace.SetStringNoThrow(pszFullyQualifiedName, d);

        if (NULL == pNamespace)
        {
            return NULL;
        }

        p++;
    }
    else
    {
        p = pszFullyQualifiedName;
    }

    /*
        The definition of EEClassHashTable is in runtime\src\coreclr\vm\classhash.h
        The description there states:

        // Hash table associated with each module that records for all types defined
        // in that module the mapping between type name and token (or TypeHandle).

        Furthermore, the EEClassHashEntry_t has the following member:

        private:
        // Either the token (if EECLASSHASH_TYPEHANDLE_DISCR), or the type handle
        // encoded as a relative pointer
        PTR_VOID    m_Data;
    */
    EEClassHashEntry_t * ret = GetValue(pNamespace, p, pData, IsNested, pContext);

    return ret;
}

I’ve determined that somewhere around here things go awry. To demonstrate printf style debugging in the native code, add #include <stdio.h> to the top of the file, and the following lines:

  + printf("[EEClassHashTable::GetValue] pszFullyQualifiedName: %hs\n", pszFullyQualifiedName);
  + printf("[EEClassHashTable::GetValue]            pNamespace: %hs\n", pNamespace);
    EEClassHashEntry_t * ret = GetValue(pNamespace, p, pData, IsNested, pContext);

    return ret;

The only thing that may be strange about this printf call is the use of %hs (at least, I had never used %hs before). There are different types of strings used in the project, and they have different format specifiers. Luckily, if you use the wrong one in a printf statement, you’ll get a helpful error message from the compiler.

After rebuilding the CLR, and rebuilding my project (and copying over the necessary artifacts), I get the following output from the unobfuscated and obfuscated programs:

UNOBFUSCATED

converterBeforeAttribute: System.ComponentModel.TypeConverter
[EEClassHashTable::GetValue] pszFullyQualifiedName: ProblematicCode.MyConverter`1
[EEClassHashTable::GetValue]            pNamespace: ProblematicCode
[EEClassHashTable::GetValue] pszFullyQualifiedName: System.String
[EEClassHashTable::GetValue]            pNamespace: System
converterAfterAttribute: ProblematicCode.MyConverter`1[[System.String, System.Private.CoreLib, Version=7.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]]
OBFUSCATED

converterBeforeAttribute: System.ComponentModel.TypeConverter
[EEClassHashTable::GetValue] pszFullyQualifiedName: _ProblematicCode.MyConverter
[EEClassHashTable::GetValue]            pNamespace: _ProblematicCode
[EEClassHashTable::GetValue] pszFullyQualifiedName: _ProblematicCode.MyConverter
[EEClassHashTable::GetValue]            pNamespace: _ProblematicCode
converterAfterAttribute: System.ComponentModel.TypeConverter
[EEClassHashTable::GetValue] pszFullyQualifiedName: _ProblematicCode.MyConverter
[EEClassHashTable::GetValue]            pNamespace: _ProblematicCode
[EEClassHashTable::GetValue] pszFullyQualifiedName: _ProblematicCode.MyConverter
[EEClassHashTable::GetValue]            pNamespace: _ProblematicCode

Hmm, do you remember when I said our obfuscator is removing the namespace associations from the metadata? Well, the above output from our command indicates that a namespace is being explicitly parsed. The guilty line is:

    LPCUTF8 p = ns::FindSep(pszFullyQualifiedName);

We’ll return to ns::FindSep in a moment.
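In the meantime, the splitting we just observed can be reproduced with a few lines of standalone C++. This is a simplified sketch that assumes the separator is just the last '.' in the name (as in my comment in the listing); the real ns::FindSep may treat some edge cases differently, and findLastDot and splitNamespaceAndName are names I made up for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <utility>

// Simplified stand-in for ns::FindSep: pointer to the last '.', or nullptr.
const char* findLastDot(const char* fullName) {
    assert(fullName != nullptr);
    return std::strrchr(fullName, '.');
}

// Mirrors the split done in EEClassHashTable::GetValue: everything before the
// separator is treated as the namespace, everything after it as the name.
std::pair<std::string, std::string> splitNamespaceAndName(const char* fullName) {
    const char* sep = findLastDot(fullName);
    if (sep == nullptr) {
        return {std::string(), std::string(fullName)};
    }
    std::size_t namespaceLength = static_cast<std::size_t>(sep - fullName);
    return {std::string(fullName, namespaceLength), std::string(sep + 1)};
}

void runSplitExample() {
    // Unobfuscated: the split agrees with what the metadata says.
    auto plain = splitNamespaceAndName("ProblematicCode.MyConverter`1");
    assert(plain.first == "ProblematicCode");
    assert(plain.second == "MyConverter`1");

    // Obfuscated: a namespace is invented purely from the '.' in the name.
    auto obfuscated = splitNamespaceAndName("_ProblematicCode.MyConverter");
    assert(obfuscated.first == "_ProblematicCode");
    assert(obfuscated.second == "MyConverter");
}
```

The second case is the interesting one: nothing in the string says whether the '.' came from a real namespace or was simply part of the type name.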

Continuing with our investigation, we decide to add the following printf lines to runtime\src\coreclr\vm\methodtablebuilder.cpp, since this is ostensibly where our types are loaded from the metadata:

    // Builds the method table, allocates MethodDesc, handles overloaded members, attempts to compress
    // interface storage.  All dependent classes must already be resolved!
    //
    MethodTable *
    MethodTableBuilder::BuildMethodTableThrowing(/* ...
    */)
    {

    /* ... */

    LPCUTF8 className;
    LPCUTF8 nameSpace;
    if (FAILED(GetMDImport()->GetNameOfTypeDef(bmtInternal->pType->GetTypeDefToken(), &className, &nameSpace)))
    {
        className = nameSpace = "Invalid TypeDef record";
    }

  + printf("[MethodTableBuilder::BuildMethodTableThrowing] className: %hs\n", className);
  + printf("[MethodTableBuilder::BuildMethodTableThrowing] nameSpace: %hs\n", nameSpace);

We go through the whole ceremony to inspect the output from our obfuscated program, and we get the following:

[MethodTableBuilder::BuildMethodTableThrowing] className: _ProblematicCode.MyConverter
[MethodTableBuilder::BuildMethodTableThrowing] nameSpace:
...
[EEClassHashTable::GetValue] pszFullyQualifiedName: _ProblematicCode.MyConverter
[EEClassHashTable::GetValue]            pNamespace: _ProblematicCode

So we can see that when the class is loaded, it has the name: _ProblematicCode.MyConverter, and there is no namespace associated with it. Later, when the class is looked up through reflection, the “fully qualified name” is still _ProblematicCode.MyConverter, but the logic there makes the assumption that the class has a namespace due to the . in the name. Since this isn’t documented anywhere (that I could find), and ECMA-335 gives no such special meaning to the . in names, this would appear to be a bug in the runtime.

Therefore, to fix our obfuscator (it seems easier to “fix” our obfuscator than to “fix” the runtime) WRT this bug, we just need to make sure that no '.' character appears in the obfuscated name. We’ll opt to ingeniously .Replace('.','^') our obfuscated name before further processing.
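Here is a toy model of why the lookup misses and why replacing the '.' helps. It assumes the class table is effectively keyed by a (namespace, name) pair, which is how the GetValue call above reads to me; ToyClassTable, splitAtLastDot, sanitizeName and the integer tokens are all invented for this sketch and are not the runtime's data structures.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <utility>

using Key = std::pair<std::string, std::string>;  // (namespace, name)

// Lookup by string re-derives the namespace from the last '.' in the name.
Key splitAtLastDot(const std::string& fullName) {
    std::string::size_type pos = fullName.rfind('.');
    if (pos == std::string::npos) return Key(std::string(), fullName);
    return Key(fullName.substr(0, pos), fullName.substr(pos + 1));
}

class ToyClassTable {
public:
    // Registration uses namespace and name exactly as stored in the metadata.
    void add(const std::string& ns, const std::string& name, int token) {
        entries_[Key(ns, name)] = token;
    }
    // Returns the token, or -1 when nothing matches the re-derived key.
    int findByFullName(const std::string& fullName) const {
        auto it = entries_.find(splitAtLastDot(fullName));
        return it == entries_.end() ? -1 : it->second;
    }
private:
    std::map<Key, int> entries_;
};

// The obfuscator-side fix: make sure no '.' survives in the obfuscated name.
std::string sanitizeName(std::string name) {
    std::replace(name.begin(), name.end(), '.', '^');
    return name;
}

void runLookupExample() {
    ToyClassTable table;
    // Obfuscated type: empty namespace, dotted name. The lookup misses.
    table.add("", "_ProblematicCode.MyConverter", 1);
    assert(table.findByFullName("_ProblematicCode.MyConverter") == -1);

    // Same type with the '.' replaced: both sides agree on the key.
    std::string fixedName = sanitizeName("_ProblematicCode.MyConverter");
    table.add("", fixedName, 2);
    assert(fixedName == "_ProblematicCode^MyConverter");
    assert(table.findByFullName(fixedName) == 2);
}
```

The type was registered under one key and searched for under another, so it is in the table the whole time and still never found.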

After we fix our obfuscator and run it again:

[MethodTableBuilder::BuildMethodTableThrowing] className: _ProblematicCode^MyConverter
[MethodTableBuilder::BuildMethodTableThrowing] nameSpace:
converterBeforeAttribute: System.ComponentModel.TypeConverter
[EEClassHashTable::GetValue] pszFullyQualifiedName: _ProblematicCode^MyConverter
[EEClassHashTable::GetValue]            pNamespace:
converterAfterAttribute: _ProblematicCode^MyConverter[[System.String, System.Private.CoreLib, Version=7.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]]

We can claim victory. At least, if all you cared about was determining precisely what was the cause of the observed behavior, you can. And we do.

The Last Loose End

Well, why did the program work after processing the assembly with ildasm/ilasm when dealing with Bug 1? You guessed it, ilasm automatically inserted namespace information into our metadata without us knowing about it. It can be tracked down to the following code in runtime\src\coreclr\md\compiler\regmeta_emit.cpp

//*****************************************************************************
// Create and populate a new TypeDef record.
//*****************************************************************************
HRESULT RegMeta::_DefineTypeDef(/* ... */)
{
/* ... */
ulStringLen = (ULONG)(strlen(szTypeDefUTF8) + 1);
IfFailGo(qbNamespace.ReSizeNoThrow(ulStringLen));
IfFailGo(qbName.ReSizeNoThrow(ulStringLen));
bSuccess = ns::SplitPath(szTypeDefUTF8,
                        (LPUTF8)qbNamespace.Ptr(),
                        ulStringLen,
                        (LPUTF8)qbName.Ptr(),
                        ulStringLen);

Here we see the use of the ns::SplitPath function, and guess what function ns::SplitPath uses to determine where to split? ns::FindSep - the culprit from earlier.

Anyways, I guess the important takeaway here is that ilasm is not the inverse of ildasm: together they can make non-trivial, irreversible modifications to the code when used in the way we’ve used them. That doesn’t mean don’t use them! Just be wary. (I’d be willing to bet that ilasm . ildasm is a projection, but it doesn’t really matter.)
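For the curious, the round trip can be played out in a toy model. The sketch below assumes that disassembly prints a type as namespace + "." + name (or just the name when the namespace is empty) and that assembly splits the printed name at its last '.', in the spirit of ns::SplitPath. TypeDefRow, disassemble, assemble and roundTrip are hypothetical names, and the real tools do far more than this.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the two name columns of a TypeDef record.
struct TypeDefRow {
    std::string ns;
    std::string name;
};

bool operator==(const TypeDefRow& a, const TypeDefRow& b) {
    return a.ns == b.ns && a.name == b.name;
}

// Toy "ildasm": print the type as a single dotted name.
std::string disassemble(const TypeDefRow& row) {
    return row.ns.empty() ? row.name : row.ns + "." + row.name;
}

// Toy "ilasm": split the printed name at its last '.'.
TypeDefRow assemble(const std::string& fullName) {
    std::string::size_type pos = fullName.rfind('.');
    if (pos == std::string::npos) return TypeDefRow{std::string(), fullName};
    return TypeDefRow{fullName.substr(0, pos), fullName.substr(pos + 1)};
}

TypeDefRow roundTrip(const TypeDefRow& row) {
    return assemble(disassemble(row));
}

void runRoundTripExample() {
    // What our obfuscator wrote: no namespace, dotted name.
    TypeDefRow obfuscated{"", "_ProblematicCode.MyConverter"};

    // One round trip silently moves part of the name into the namespace.
    TypeDefRow once = roundTrip(obfuscated);
    assert(!(once == obfuscated));
    assert(once.ns == "_ProblematicCode");
    assert(once.name == "MyConverter");

    // A second round trip changes nothing more.
    assert(roundTrip(once) == once);
}
```

In this model the first round trip loses information and the second one is a no-op, which is at least consistent with the projection guess.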

Conclusion

Hopefully we’ve demonstrated that knowledge and research of internals and low-level specifications can be useful in solving problems with your software whose solutions cannot be determined through standard troubleshooting techniques. I will be satisfied if this article inspires someone to dig through some of the dotnet source code for themselves. The task is somewhat daunting – but if I can do it, so can you (just don’t expect to understand everything in one day, or perhaps even in one year).