Sunday, March 4, 2012

SiDoku #1

Flylogic posted a brief tutorial and a logic chip challenge at http://www.flylogic.net/blog/?p=32. I have been messing with a standard cell based chip recently, and this is one of the more complicated cells from it. In their spirit, here is a similar challenge.

Top metal (M1 and M2 visible):
Delayered:

Active area:

The two delayered images are "identical" but have different artifacts. I gave both since it's what I used and it helps a little to piece things together. If you are the first to solve it and are interested, I'll see if I can post at least a top metal photograph of a chip of your choosing. Get the chip to me somehow, or if it's something relatively common I might have it hoarded somewhere. Solving this means 1) giving a high level description of the functionality of the device and 2) giving a gate level schematic (ie with AND gates etc where possible instead of transistors) with pins labeled to the M2 contacts.

Hint: the two bridged contacts on the top metal (M2) are part of the cell and can be assumed to be connected.

I'll release the solution in a week or two if no one gets it. Some more resources to help, including an inverter from the same chip: http://siliconpr0n.org/wiki/doku.php?id=cmos:start There are also some instructions on how to load these images into Inkscape at http://siliconpr0n.org/wiki/doku.php?id=tutorial:digitizing_with_inkscape

Monday, February 20, 2012

Silicon Pr0n (old) backup restored

I have restored the silicon pr0n wiki from an old backup so it should be less of a skeleton now. Find it here:
http://siliconpr0n.org/wiki/doku.php?id=start
It may still be a bit rough around the edges, as a lot of things hadn't been polished up yet after the Wikispaces conversion.

If you want to contribute send me a brief e-mail with your interests in IC RE and I'll create you an account.

Saturday, February 11, 2012

Tile stitch

Now that I have a microscope that can generate lots of imaging data, stitching has become the bottleneck. I forget exactly how long it took, but the large memory requirements of gigapixel (GP) sized images made them take a day or so to stitch on my desktop. I might have been fine with this, except that something, I suspect enblend or one of its libraries, seems to generate a number of glitches when the images get very large. I've seen this on both 32 and 64 bit systems and should probably file a bug report... In any case, I wanted to reduce system requirements since I figured there was a way to do things better.

Recently I've been playing around with the Google Maps API as a way to use tiles instead of viewing the huge source images. I first tiled a visual6502.org image, the MOS 6522, which you can find here. This is nice as people without powerful computers can view this large image at full detail. To be fair, the jpg compression also significantly reduced its size, although not to a point where significant quality was lost for my purposes. The tool to do this can be found here.
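The tiling step itself is conceptually just computing fixed-size crop boxes over the panorama. Here is a minimal sketch of that bookkeeping (illustrative function and names of my own, not the actual tool's API):

```python
TILE = 256  # Google Maps style tiles are 256x256 pixels

def tile_boxes(width, height):
    """Compute the crop box (left, upper, right, lower) for each
    fixed-size tile covering a width x height panorama.

    Keys are (tile_col, tile_row). Edge tiles are clipped to the
    image bounds; a viewer pads them out as needed."""
    boxes = {}
    for row in range(0, height, TILE):
        for col in range(0, width, TILE):
            boxes[(col // TILE, row // TILE)] = (
                col, row, min(col + TILE, width), min(row + TILE, height))
    return boxes
```

Each box would then be fed to something like PIL's Image.crop() and saved as an individual jpg for the map viewer to fetch on demand.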

However, this still leaves the question of how to avoid creating the large intermediate images. After some thinking I came up with the following:
  • nona (a "remapper": transforms source images to their location in the output image) will only output images that are in the cropped area. Note that it will still spend a small amount of time on each of the other images deciding whether it needs to generate them
  • enblend (a "blender": resolves conflicts when two source images occupy the same area) output should only differ in areas where there's a potential conflict
  • There is no potential for conflict in areas where an image is unique. In particular this holds at the edges, and since I have 1/3 overlap with neighbors, the center 1/3 of each of my images is conflict free
This allows the following algorithm.



Step 1: get an optimized .pto

You can get this from any source you want. I am using my pr0nstitch program (discussed in a previous post), which I then optimize for size and crop in Hugin. While the Hugin stage could probably be automated, it at least doesn't take very long and gives me an idea of how well the optimization went before trying a full stitch.

For this example I'll show a smallish input image. When stitched with Hugin it looks like this:

As an aside, this is a MOS 6522 where I used HF to remove the passivation and then ate out the metal. The end result is that you can still see where the metal was (because there is still a lot of SiO2 left over) while still seeing all of the bottom layers.

Here's a visual from Hugin of what the input looks like:

I think the color gradients are related to using semi-polarized light on high quality but not strain free objectives (Mitutoyo Plan Apo 20X). On the bright side, it makes the source image boundaries much more pronounced.

Step 2: pick the largest single panorama size you want to stitch


Ideally the supertile should be the largest size that enblend can fit into RAM (and that is error free, per the bugs I've had...). The software defaults to 4X4 source images (~2.5 X ~2.5 shown above) and has a command line option to customize it.

Step 3: stitch the selected region


Remap (nona) and blend (enblend) to form a single large panorama image (a "supertile") that is a fraction of the entire output.


Step 4: generate tiles


Greedily generate all tiles that are "safe" per the criteria from the last bullet above. I assume tiles along the edge of the full panorama are fully safe, as well as any tiles that are more than a half image width in from the supertile border. Add these to a closed list, since other supertiles may be able to regenerate them.

In the visual, the red crosshatched areas are considered unsafe because they are too close to an area where the blending could vary across supertiles. The green boxed-in area contains all the tiles that we can safely generate. Here are a few actual tiles from the upper left hand corner of the full sized image:






Step 5: repeat for other supertiles

Shift the supertile so that you can generate more tiles safely. This works out to roughly shifting it by the border width plus a tile size. Only generate a tile if it's not in the closed list.
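Steps 4 and 5 amount to sweeping a supertile window across the panorama, claiming only tiles in each window's safe interior, with a closed list preventing duplicates. A rough sketch of that bookkeeping (hypothetical names; the real tool handles the shift size and alignment details differently):

```python
TILE = 256  # output tile edge in pixels

def plan_supertiles(pano_w, pano_h, st_size, border):
    """Assign every tile to exactly one supertile.

    A tile is "safe" for a supertile if it sits at least `border`
    pixels inside every supertile edge, except edges that are also
    panorama edges (blending cannot differ there). The closed list
    stops later supertiles from regenerating a tile."""
    step = st_size - 2 * border - TILE  # consecutive safe zones just touch
    assert step > 0, "supertile too small for this border"
    closed = {}
    for sy in range(0, pano_h, step):
        for sx in range(0, pano_w, step):
            x1 = min(sx + st_size, pano_w)
            y1 = min(sy + st_size, pano_h)
            # Safe interior: shrink by border except at panorama edges
            lo_x = sx if sx == 0 else sx + border
            lo_y = sy if sy == 0 else sy + border
            hi_x = x1 if x1 == pano_w else x1 - border
            hi_y = y1 if y1 == pano_h else y1 - border
            for ty in range(0, pano_h, TILE):
                for tx in range(0, pano_w, TILE):
                    if (tx, ty) in closed:
                        continue  # already produced by an earlier supertile
                    if (lo_x <= tx and min(tx + TILE, pano_w) <= hi_x and
                            lo_y <= ty and min(ty + TILE, pano_h) <= hi_y):
                        closed[(tx, ty)] = (sx, sy)
    return closed
```

Only the supertiles that actually claim tiles need to be remapped and blended, which is what keeps the intermediate images small.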


The tool can be found here.


The first actual chip stitch I generated using this algorithm can be found here. There are a few stitching artifacts, but I believe they are more related to bad optimization than to the stitching process. I have been somewhat lazily choosing the upper left hand image as the reference for position optimization; in several of my large stitches, images get noticeably worse as you move away from this point. Additionally, there is a bug where I can lose a tile around the right and bottom edges. Presumably this isn't hard to fix, as it's probably something I just need to round up instead of down.

The performance improvement is also pretty good. I did several GP sized images and each stitch completes in about 3 hours. I haven't played with panotools' GPU mode to see if that results in any improvement. For reference, my system has a 3 GHz Woodcrest dual core CPU (although I'm currently only using one core) with 12 GB of RAM. I've been using a 10 GB enblend cache. On that note, I believe this algorithm could also be parallelized to one job per supertile without too much effort.
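A sketch of what one-job-per-supertile could look like (illustrative only; stitch_supertile stands in for the real nona/enblend invocation):

```python
from concurrent.futures import ThreadPoolExecutor

def stitch_supertile(offset):
    """Stand-in for remapping + blending one supertile.

    In the real pipeline this would shell out to nona and then
    enblend (e.g. via subprocess) with the .pto file and this
    supertile's crop rectangle."""
    x, y = offset
    return (x, y)  # placeholder for the finished supertile

def stitch_all(offsets, jobs=2):
    # Threads suffice here: the heavy lifting would happen in
    # external nona/enblend processes, not in Python itself.
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        return list(pool.map(stitch_supertile, offsets))
```

Since each supertile's safe tiles are disjoint by construction, the jobs don't need any coordination beyond sharing the precomputed closed list.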

To be fair, as part of this process I also played around with caching options and such as I learned more about how the remapping and blending phases work. Things on the TODO list for next steps:
  • Fix the clipping bug
  • Start using a lens model
  • Try optimizing from a center anchor instead of a corner
  • Look into ways to improve the accuracy of the optimize step (ex: statistical deviations)
  • See what I'm losing by using jpgs over tifs / pngs
Finally, my wiki Silicon Pr0n has been down for a while. Now that I have a job, I decided to rent a VPS and get a domain name. The wiki is now at http://siliconpr0n.org/wiki/ This URL should now be stable regardless of whether the server blows up, since I can always point it somewhere else. Additionally, I have backups in place now. However, I'm still trying to recover some of the old data and it may have some (gaping) holes until I can get it back. I tended to post more material to it than to this blog, so it's a good resource to have back up.

EDIT:
For the heck of it I decided to figure out how to package this. Try it out at http://pypi.python.org/pypi/pr0ntools/1.0

Also, I found that the anchor image for optimization is more important than I realized. A lot of my stitching artifacts appear to be due to somewhat lazily choosing the upper left hand corner as the anchor, which propagated a lot of errors. I'd still like to add a lens model to see how that further improves error.

Thursday, December 15, 2011

CNC microscope mk2

This post will show my new imaging setup, what I was trying to accomplish in setting it up, and the design decisions I made to get it working.

A bit of history first. This is the first IC capable microscope I got some time on:


But it wasn't well suited for computer control, and inverted microscopes are difficult to mount IC samples in. I also got ahold of a biological microscope that could view ICs using a strong halogen light from the side. It did, however, have room to mount these:


The concept which you can kinda see here:


The mechanics were kinda iffy and worked so-so, but it did get me thinking. After I got some money, and with a little help, I was able to get a Unitron microscope. Looking back it wasn't a super high quality microscope (very old?), but it got me a lot of experience figuring out what I needed. When I was about to scrap it, I decided to get a little more aggressive and try to retrofit it for CNC control. It turned into this:


Which was finally a working CNC microscope! Very crude but I guess it turned out pretty well for the money and effort I put into it.

And here is the new setup:


Which cost me about 10X as much as the previous setup, but is closer to the sort of thing I wouldn't want to run into in a dark alley or would fear might become sentient. It is interesting to note that the first run of this used the motor drivers and motors that the original setup failed to really get working. I only bought the linear stages for the original setup, so I suppose the second setup cost about 10X as much as the original as well. An interesting progression of what you get for logarithmically increasing cost. I don't plan on adding a new point to that curve anytime soon ;)


Objectives

I was looking for several things for this:
  • Mitutoyo objectives seem to be prized by many, wanted some to try out
  • DIC seems to give good results
Unfortunately, both of the above are also very expensive. However, with some patience I managed to get both for at least a reasonable deal, although neither was cheap. I got a set of 5X, 10X, and 20X Mitutoyo objectives included with the microscope. They have good optical quality and long working distance, and so have been nice to work with.

The DIC was a little trickier. I realized that the prisms aren't labeled very well, and someone not very familiar with them might not know how to list them for maximum buck on eBay. So I kept an eye out for the two things they might be attached to:
  • Objectives
  • Turrets
Objectives tend to be expensive, and DIC prisms tend to be matched with their objectives, meaning others find them as well. Turrets, however, are more of a long shot. Finally I got lucky and found this:



I got the whole assembly for slightly more than a single prism sells for. Best of all, they are nice Olympus Neo SPlan prisms as opposed to the older Neo prisms. The set includes 10X, 20X, 50X, and 100X.

Unfortunately, I didn't have any NIC objectives or polarizers. The real Olympus polarizers would easily cost me $300 for a surplus pair. However, I'm not sure how special the polarizers are, or if you are really just paying for the mounts. So I bought a pair of polarizers for $8 shipped:

Other side:

Basically it's 4 pieces:
  • Polaroid (coated I presume) glass
  • Aluminum case
  • Retaining ring (screws in to hold the glass in place), visible in first picture
  • Loose fitting outer ring, above the writing in the bottom picture
Unfortunately it's too tall as is. Fortunately I have a small rotary table which allows me to reduce its height as well as mill a cavity for it:


The section holding the loose fitting outer ring was milled out and the ring was discarded. This brings us to the device it will rest in. I found an old heatsink of the correct height and started shaping it a little closer:


Which was formed into this with the blank slider for comparison:


For several reasons I wanted the optic to be removable. The easiest way to do that tends to be a setscrew. It's a bad idea to put a metal setscrew directly against glass, so I decided to use the aluminum case the polarizer came with to buffer the force. A setscrew holds the original polarizer case in the slider. The slider has a step in it so that the polarizer case can only be inserted from one side, which holds things in place well. The optic is then inserted and the retaining ring holds it in place. It looks like this:


Which seems to have turned out pretty well. Fitting it in the microscope:



And the matching polarizer is loosely sitting in the illuminator so that I can rotate it around:


I tried using a 20X Nomarski prism with a standard Neo SPlan 20X objective, but it doesn't give you DIC, so I bit the bullet and bought an NIC rated one. I didn't get it at a super great price, but I wanted something to play with.

I also discovered that I had underestimated polarized microscopy. You can get a number of really neat effects on semiconductors by using crossed polarizers. My cheap polarizers give a dark purple-blue color when crossed, but I like blue so it turns out okay. I'm glad to have a reasonably high power lamp, as this takes considerably more light than standard brightfield microscopy.

XY / linear stages hardware
The core of the robotics section is a pair of Micro-Controle / Klinger / Newport high precision stages. They were listed as being from a high precision milling machine or something. This could be the case as, while most stages are direct drive motor to shaft, these have a planetary gearbox:
This increases their torque (which I really don't need), but it also makes it easier to move smaller distances. I'm not sure what the actual precision of the stage is, but the aggregate step size I get out of the system is about 110 nm. So far this has been far beyond anything I've needed, but it may become more convenient as I move to 100X imaging.
For whatever reason they use variable reluctance motors, which I'm unfortunately not as familiar with as stepper motors. I tried making a simple stepper-like driver for them, but it couldn't hold up under any reasonable load. I read some papers and ultimately decided it was going to be less work to adapt some stepper motors instead.
The two drive motors are slightly mismatched NMB brand NEMA 17 motors. I selected NMB motors since I was hoping they would use the same connectors as my NEMA 23 Sherline motors, but no dice. Even so, it has been really nice that the motors have plugs that let me quickly disconnect the wires. They are mounted to the stages via some cheap flexible shaft couplers. I had to make some shims to get the 1/8" (0.125") Klinger motor shafts to fit into the ~0.2" NEMA 17 shaft couplers. Note that I didn't remove the motor, as it contained the sun gear, but rather replaced the opto-encoder disc at the end:
I machined the shims out of some old steel standoffs by carefully turning them on a rotary table and using a 1/16" endmill to mill out the center. To be honest, I was quite surprised I didn't break the endmill. The adapters would have worked really well except that they deform when clamped, so it wasn't a good idea to try to move them around. Things still work okay, but I might want to make some fresh shims or order some brass rod for a more proper solution.
Robotics electronics
All three motors are driven by Precision Motion Controls stepper drivers. These are fed by a "USBIO" PIC based microcontroller board; basically I hammer on some GPIOs through USB. On my Linux machine I was able to get a few MHz, but the Windows machine this runs on seems to be limited to about 500 kHz. I'm pretty sure the motors can move a lot faster than this, but it would take more care in software to get velocity and acceleration matched well enough that things don't slip. Since Windows XP isn't exactly an RTOS, I'd be surprised if that could be made super reliable, but possibly well enough for what I need.
Unfortunately, while the USBIO board is powered off of a 5V USB line, it uses 3.3V logic. Logical 1 is defined as > 3.5V on my motor units, which was causing unstable control. To solve this I added a simple buffer chip powered off of the USBIO board's VCC breakout:
The buffer chip (a CD4069 or something, I forget exactly which) doesn't actually give near 5V out, but it's good enough for what I need. I couldn't find any industrial style (ie with screw terminals) voltage level converters, but I assume they must exist. It's not a terribly complicated circuit board; maybe I'll make some to clean this up a little. I have some perf board laying around which would also probably do the job.

Control software
The software is descended from the original control software. However, it's very different in that it performs control in realtime instead of generating g-code. It's at roughly the same location as the original software; I'll probably merge it into the original's location one dir up, but until then you can find it here.
I tried a few different imaging libraries and eventually settled on the Python Imaging Library (PIL). Although I had used PIL before for image processing, I didn't realize it could do image capture as well. It has mostly worked well, but I have gotten truncated images on a few imaging runs. I think it may have loosely had to do with the AmScope software fighting with mine, so I added a check to refuse to start a job if the AmScope software is running.
The USB device has a serial driver, so no real magic there. I did write a wrapper class, but I already had that from previous projects.
After a little thought, I decided I wanted a GUI to make moving around more freestyle instead of entering coordinates at the command line. Most of my GUI experience is with Qt, so PyQt seemed like a good choice. It's a "programmer's GUI" in the respect that it's a jumble of UI components without much thought to ergonomics / careful placement.

Camera

The old setup used a point and shoot camera with mixed results. It was nice that it could auto-focus, but it would sometimes focus poorly and I couldn't decide what I wanted to focus on. So I decided to look around for a real microscope camera and found an AmScope MD1800 (8.1 MP USB) at a halfway decent price.

Originally I was using the camera through the eyepiece but this had several issues:
  • Tended to move around
  • Couldn't look through very easily
So I acquired a trinocular head which is where it rests today:
The trinocular head doesn't seem to fix it in place or seal it super well, so I made a few crude enhancements (maybe there is a better adapter I should be using?). First, I put a rubber o-ring around the base to stop dust from getting in. Second, I put some rubber strips between the camera and the base and then zip-tied them on. This crudely fixes the camera in place while still providing some stress relief for things moving around.

I attempted to reverse engineer the camera driver to get it working on Linux and sort of got it working, except that I haven't figured out how to sync frames. I'll see if I can post some of the data dumps here in case someone has an idea, as I'd love to ditch Windows.


Focusing
On the old microscope the sample rested on a kinematic mirror mount (not shown) which allowed basic sample adjustment:
However, this was on a boom, so it was difficult to adjust without shaking things around. I was able to compensate a little, since the main reason it was mounted that way was to give me focus control from a precision Z stage; that helped but didn't solve the fundamental problem. Also, the Z axis was kinda shaky and held in place by a rubber band to keep backlash down (in the spectrometer it was scrapped from, the mirror's weight held it down). So for the second revision I tried to improve both of these focusing elements, which I'll describe separately.
Active focusing (z axis)
I focused (haha) on this first since it's what I used on the old setup. Z axis control is via a NEMA 23 motor with a timing belt coupled to the fine focus knob:
After taking the focus knob covers off, I noticed some threaded rod sticking out:
It's probably M6, but I don't really know for sure. I had a timing pulley on a 1/4-20 bearing. This sounds like a really bad idea at first since the thread doesn't match and it would just slip around anyway. However, neither of these problems is hard to get around. 1/4-20 is very similar to M6 but not similar enough to mount directly on. Fortunately, I had an M6 standoff that fits well on the microscope and that the 1/4-20 screws into pretty well too. Finally, I put a serrated washer between the pulley and the spacer to fix it in place.
Most of my pulleys are designed to fit NEMA 23 / 0.25" shafts, which was part of the reason I wanted to use a NEMA 23 motor, so mounting the drive pulley was easy. I'm trying to avoid mounting the motor on the microscope in hopes of reducing vibrations, so it's currently mounted to the adjacent optics table via 1" 80/20 t-slot aluminum. I have to re-adjust it every time I move the coarse Z axis, but it's not difficult to adjust. I've considered several alternatives (attach with damping to the microscope, rail mounted, etc) but haven't settled on something more permanent yet, in part because I'm not using this much, as described next.
Finally, this has considerably more vibration than the XY motors, which makes me less enthusiastic about using it. I'm not sure if it's because the micro-stepping isn't adjusted properly on the NEMA 23 or just that it tends to move around more than the smaller NEMA 17 XY motors. I slowed it down a lot more than the XY motors so that it's at a reasonable level, but I'm pretty sure the mechanics could be improved.

Sample leveling
I also wanted to improve the basic angular control. It would have been nice to have this under computer control but, since it's a one time (per sample, or sometimes per multiple sample run) setup thing, it's not a high priority.



Theory
This is how our system might look to an observer:
The line at the bottom represents the focal plane para-axial with the objective. The pivot is the bearing on which the kinematic mirror mount's mirror holder rests. Turning the screw at the end raises or lowers the end of the plate in a single axis and effectively controls a single angle. We are only interested in the top of the sample, not the mirror holder that the sample rests on. I assume the sample is flat such that its surface forms a focal plane. The second line represents some adjustment of the axis screw to form a second focal plane.
Note that this model assumes that the XY plane is still parallel to the imaging plane. This is a non-trivial assumption, since the difference is likely to be significant and to require active Z focusing to get a good image. However, for my purposes the most important thing was to level out samples, like packaged chips, that may need significant adjustment. My stages are reasonably precise, but I have yet to determine if I need to try to level them out better. The easiest way to do that is probably to insert shims or play around with how tightly I set them in. The bottom stage only has 1 out of 4 screws installed and is probably the largest source of imperfections. I have been considering either drilling out the original plate or creating a new one so that it can be secured more regularly.
In this system angles are relatively small, hopefully no more than a few degrees. This allows us to make a small angle approximation so that sin(x) ~= x to simplify the math. This gives a triangle:

The following are proportional:
a / a' : (t + t') / t' : (z + z') / z'
a / a' = (z + z') / z'
We know a, a', and z and want to find z'
z' a / a' = z + z'
z'( a / a' - 1) = z
z' = z / ( a / a' - 1)
Taking a few simple cases gives a warm fuzzy feeling that this is correct. The two simple cases are that the first angle was correct and that the second angle was correct.

If the first angle was correct:
a = 0
z' = z / (0 / a' - 1) = z / (0 - 1) = -z
That is, we have to go back to where we were originally.

If the second angle was correct:
a' = 0
z' = z / ( a / 0 - 1) = z / (inf - 1) = z / inf = 0
That is, we stay where we are.
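Both the formula and its limiting cases fit in a few lines. A sketch using the derivation's variables (a and a' are the measured offsets before and after the screw adjustment, z is the focus travel between them):

```python
def leveling_z(a, a_prime, z):
    """Compute z' = z / (a / a' - 1) from the small angle derivation.

    a' == 0 means the second angle was already correct, so z' = 0
    in the limit. (a == a' would mean no adjustment happened and
    the formula is undefined.)"""
    if a_prime == 0:
        return 0.0
    return z / (a / a_prime - 1)
```

The two sanity checks above fall right out: leveling_z(0, a', z) returns -z (go back to where you started) and leveling_z(a, 0, z) returns 0 (stay put).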

Practice
In practice, the above turns out to be too cumbersome to do by hand. If I had computer controlled tilt capabilities (such as with a Newport / New Focus 8071), these formulas might be more directly useful. I tried to apply them, but the time to whip out a calculator and try to turn the knobs "just right" doesn't work well enough.
However, you can observe that if you iteratively do the following:
  • Focus on a point close to the pivot, ideally with the edge of the chip on it
  • Move to a distant point along a single axis, ideally the other edge of the chip
  • Adjust the screw until in focus
  • Move back to the original location. If it's still sufficiently focused, you are done with this axis
  • Repeat for the other axis
I use a Newport MM2 as seen above. The sample rests on the back of the unit, and the front, where a mirror would usually go, is against the stage. I put a piece of paper down so that if a sample falls it doesn't fall through the stage into the abyss. To avoid drilling and other alignment work, I simply clamped it down using appropriately sized spacers. Additionally, for large samples like the 486DX seen above, I use Blu-Tack to keep them from moving on long runs.
I'm thinking of marking some guidelines on the stage to make placing chips at the pivot point easier, as leveling is theoretically just a single pass in that case. If they are not at the origin it will take more passes, but typically not more than 4. For a large chip, such as some GPUs I've been imaging, my stages are slow enough that it can take 2 minutes to reach the other side. So for these chips it takes about 20 minutes to level. It's not so bad though, since the manual interaction doesn't take very long. (Update: I did scribe some lines; it was a very good idea and chip placement is much quicker now.)
Ultimately this seems sufficient without any active Z control. On this setup the stages seem para-axial enough to the focal plane that I don't require any active focusing. Since using an active Z axis still doesn't eliminate uneven focusing across a single image, this method is generally preferable. Currently the main advantage of active Z control, although I'm not using it much anymore, is that it reduces setup time since it only requires collecting three or four points in a fairly small timeframe.
Next steps
I fed some images into the stitching software I was using before and they come out a lot better because of the closer to planar setup. I played briefly with constructing a more proper lens model, and it would be a good idea to give that another go.
I'd like to share the chip images, but I'm realizing they take up obscene amounts of bandwidth and storage. I briefly played around with a Google Maps style display but didn't pursue it very heavily; it should at least cut down bandwidth and relieve the burden of needing to open huge image files.
I played around with some focus stacking but wasn't terribly impressed. I need to see if I can find some more fine tuned options in panotools or look into alternative software. This is a real problem at higher magnifications (ex: 50X) on large chips where there may be many layers. If I only store the final focus stacked images, I don't lose much disk space or imaging time, and the pictures in theory can look a lot better.
Somewhat unrelated and on my TODO list is finishing digitizing the MOS 6522. I put a lot of work into getting a better imaging platform and I'm now getting pretty satisfied that I have one. I want to focus future work on acquiring data and automating analysis. My first goal is to find techniques for automatically extracting the metal layer. Among imaging techniques, I'm thinking of trying IR light, since everything else should be pretty transparent to it. However, there are a large number of minor difficulties in getting this to work, so I may try visible light techniques first. One thing in particular is that metal tends to saturate the video sensor without any polarization, which I could probably use to my advantage.
I'd like to be able to rotate samples more easily and install my larger, flat kinematic stage. The easiest way for me to do that is to remove the bottom illumination lamp housing / stand and replace it with my Newport rotary stage and a riser plate. I'm looking to see if I can get a precision lab jack like a Newport 281, but they look like they cost more than I want to spend. However, I've already had some commercial interest from people wanting to use this system for wafer failure analysis imaging, so it might be worth the investment as it would let me image wafers a lot more easily.
Finally, I'd like to improve the motor mounts so they are wiggle free. This shouldn't be too hard, as I either just need to spend some time on a lathe (I don't have one, but I have a TechShop membership) or order some telescoping brass tubing and fit it as needed. While this doesn't seem to affect imaging too much, I'm sure it's not ideal for the motors and probably causes friction that could affect repeatability, especially if I want to raise speeds.

Thursday, September 22, 2011

Why basic logic chips are bad for beginners

For a while I was making a big push to try to decode a basic logic chip such as a CD4001 or a 7402/74HC02 sort of thing. The justification was that they would be very simple to analyze because:
  • Feature size is large
  • Simple logic function
  • Many examples of the same chip to compare
  • Inexpensive
  • Non-CMP: can help to see layers
However, I've come to the conclusion that they are actually rather difficult to analyze compared to larger chips, for the following reasons.


Highly irregular transistor size

On modern chips transistors have to be cookie cutter sized in certain processes, not to mention they are often used in standard cells. Even a more custom chip like the Intel 4004 has somewhat irregular transistors, but they still tend to be pretty limited in size.

Basic logic chips have irregular transistors for several reasons. First, due to the low transistor count, every transistor gets loads of attention. They will be molded and shaped almost like on an analog chip instead of a digital chip.


Less data to work with

Having fewer logic functions to figure out also means that you have fewer to use as examples. I did mitigate this a bit by decapsulating several chips, since decapsulation / photography time isn't too bad for SMD versions.


Use of less common technology

This isn't necessarily bad in itself, as it gives you more breadth, but it can still get in the way. For example, the CD4000 series uses metal gates, which are interesting for historical reasons but not used on typical modern chips. 7400s are similar, since most modern digital chips don't use bipolar logic.

To hammer on this point a bit more, let's move on to...


Use of power transistors

This one is really the killer, since it makes the chips really different. Even microcontrollers will beef up output transistors, but nothing like you'll see on these chips. Here's something more or less like a textbook MOSFET:


It has a metal gate, which isn't quite so common, but a clearly visible source, gate, and drain. This is what I was hoping to see, and to be fair I did find two chips like this (the above is from a Fairchild CD4011). Here's an example of a transistor from a Texas Instruments CD4001:

Top metal:

And delayered with HF (same chip model, not the same die):


I saw some structures like this and thought they might be diodes, since I figured they were forming a PN junction. However, if you read this you'll see that they are probably actually power MOSFETs, in particular vertical diffused MOS (VDMOS), also known as double-diffused MOS (DMOS).

It's also worth pointing out, in case you are wondering why the latter looks so much better, that the first image was taken on a Unitron N with Aus Jena (Zeiss) objectives and a 3.2 MP point-and-shoot, while the latter was taken on a Mitutoyo microscope with Mitutoyo Plan Apo objectives and an 8.1 MP AmScope eyepiece camera.


Conclusions

So what would the best chip to learn on be? The Intel 4004 is a really great example, since it has so much documentation and Flylogic has provided very nice images. On the downside, it's not something you'd want to do yourself, since a 4004 might cost you $150 or more. There are plenty of similar, less expensive chips of that scale, though, that should have similar value.

Wednesday, September 14, 2011

Understanding the Intel 4004

I have a new microscope that yields way better images than anything I had before, but, unfortunately, I'm having some difficulty getting a camera hooked up to it nicely. In the meantime, since I still have images from other sources, I thought I'd take some time to go through some parts of the Intel 4004. Flylogic has a similar write-up. This one hopes to go a bit deeper, especially by showing how you can use tools to scale this up to analyze a full chip.

This article will use data from a variety of sources. The first is the Intel 4004 35th anniversary page, which I like since it explicitly lists material under the Creative Commons Attribution-Noncommercial-Share Alike 3.0 License. They also have a composite image (this is by the author of the 4004 website, so I presume it's under the CC license?), schematics, and other assorted information. The datasheet and user manual can be found on Intel's website. I got the masks from FPGA-netlist-tools and read them in CIF format, specifically this file. If you don't want to load this up, you can find the original here. Finally, I'm also going to use a few small excerpts from the Flylogic top metal image and the Flylogic delayered image, which I think should be fine under fair use but will take down if they would rather not have them here. I thought Intel had some CC or related images (had maybe even contracted the Flylogic job), but I can't find those now. If there are no freely available images and someone wants to donate a 4004, I could release some under a free license of your choice (or three if you have a million: one top metal, one delayered, and one for my collection!).
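As an aside on the CIF format: it's a simple semicolon-delimited text format, so pulling geometry out of it yourself isn't hard. Here's a minimal sketch that only handles layer selects and boxes; the coordinates below are made up, and real CIF files also contain polygons, wires, and subroutine definitions/calls that this ignores:

```python
import re
from collections import defaultdict

def cif_boxes(text):
    """Collect B (box) statements per layer from a CIF string.

    Minimal sketch: only handles L (layer select) and B (box) commands,
    ignoring polygons, wires, subroutine definitions, and transforms.
    """
    boxes = defaultdict(list)
    layer = None
    for stmt in text.split(";"):
        stmt = stmt.strip()
        if stmt.startswith("L"):
            layer = stmt[1:].strip()
        elif stmt.startswith("B"):
            # B length width center_x center_y
            length, width, cx, cy = map(int, re.findall(r"-?\d+", stmt))
            boxes[layer].append((length, width, cx, cy))
    return dict(boxes)

# Made-up example (CM = metal, CP = poly layer names):
sample = "L CM; B 400 200 100 50; B 100 100 0 0; L CP; B 50 50 10 10;"
print(cif_boxes(sample))
```

Something along these lines is enough to re-render masks in whatever colors you like.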

According to Wikipedia, the 4004 is 10 um PMOS, which I was also able to verify on the datasheet ("4004 SINGLE CHIP 4-BIT P-CHANNEL MICROPROCESSOR"). This is similar to the NMOS logic I used in my JSSim inverter write-up, except with the polarities reversed. Example PMOS inverter here.

However, I started getting a little confused as I looked through things. Lajos describes pull-ups where I would expect PMOS to have pull-downs. Looking at various circuits, I also see what appear to be pull-ups. This confused me, since I was thinking of something along the lines of VDD = +5V, VSS = 0V with non-TTL-compatible I/O (I didn't think through whether the body connection is the correct polarity; probably best to ignore it):


This doesn't really work, though: people would expect TTL compatibility, and it doesn't match what I'm seeing in the mask as noted above. After looking at the user manual I found this:

This shows that VDD = -10V and VSS = +5V, which allows it to preserve TTL-compatible I/O. It was paired with this summary diagram:


The input stage shows why they want the +5V VSS: it provides the negative bias needed to trigger the PMOS input at 0 V, while still allowing the pull-up to supply negative power for the input = 1 case. I was told that if you think of the circuit in terms of VDD, VSS, etc. rather than actual voltages, it should be more or less identical to NMOS logic. However, my understanding is that the depletion resistor is wired middle to bottom instead of middle to top, so to speak, as above. I don't have a good enough understanding of depletion modes to really know why this is the case.
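To make the TTL-compatibility argument concrete, here's a toy model of an input PMOS with its source sitting at VSS = +5V. The -1.5V threshold is just an illustrative number, not something from the datasheet:

```python
def pmos_conducts(v_gate, v_source, v_th=-1.5):
    """A PMOS turns on when Vgs falls below its (negative) threshold.
    The -1.5 V default is illustrative, not a real 4004 parameter."""
    return (v_gate - v_source) < v_th

V_SS = 5.0  # per the user manual: VSS = +5 V, VDD = -10 V

# TTL levels driving a PMOS input whose source sits at VSS:
for v_in, name in [(0.0, "TTL low"), (5.0, "TTL high")]:
    state = "on" if pmos_conducts(v_in, V_SS) else "off"
    print(f"{name}: Vgs = {v_in - V_SS:+.1f} V -> {state}")
```

A TTL low of 0V gives Vgs = -5V, comfortably turning the input transistor on, while a TTL high of +5V gives Vgs = 0V, keeping it off, which is the whole point of raising VSS to +5V.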

With that cleared up, let's see an example. I decided to look at the upper left hand corner because it was convenient. Cross-referencing a particular signal such as TEST against the schematic might have been more logical, but I was focusing on the masks rather than following the schematic. Here's the metal mask with VDD and VSS highlighted, as well as, in orange, an upper left hand corner area that I'll look at later:


And a complete version also marked, but without VSS/VDD highlight:


Let's start with a theoretical closeup. Here's a dump from my CIF rendering:

The red (okay, so it's pink; otherwise I had opacity-related rendering issues) is poly, the green is diffusion, the blue is metal (VDD), and the black areas are contacts. So we have two contacts tied to VDD and one going elsewhere, which forms our pull-up. Note that you might have a self-aligned gate for an enhancement MOSFET, but for depletion the entire area is diffused for normally-on behaviour. This is what it actually looks like (Flylogic) in metal:

If the poly had blocked the diffusion, we would not have seen the diffusion squiggle inside of the poly area. If the poly and diffusion masks were reversed, we could still tell that they were layered vs blocked, since we can see a change at the squiggle's exit rather than a solid line. And after delayering:

The poly and diffusion look almost identical. Fortunately, with whatever microscope technique they used (DIC maybe?), they get some good contrast between the traces and the background. I can't say for sure why it discolors where they meet; maybe just minor stress due to proximity.

Okay, but one transistor in the middle of nowhere is kind of hard to follow, so let's take a look at some of the nearby pins and see how it fits in. The upper left hand corner ties in the CM-RAM 0-3 pins, increasing clockwise starting with CM-RAM 0 and 1 on the left and 2 and 3 on top:


These pins essentially provide the chip (bank) selection outputs to the RAM chips (4002). Since you can operate the system with a single 4002, these are optional pins. Nonetheless, they're fine for demonstration purposes. Let's arbitrarily focus on CM-RAM 0. Here's a detailed picture of its layout:


The first thing I did was label all of the transistors that looked like they were involved in this input stage:


There is an identical input stage above that shares a small bit of poly with Q4 (diffusion just visible), but this does not influence us, so ignore it. I then created a schematic with the transistors arranged as in the above:

This is fine for layout capture, but a little ugly to read. Let's rearrange it into something a little nicer:

And then simplify by converting transistors to resistors where appropriate:

Cross-referencing against Lajos's schematic:

They are equivalent. I'm particularly happy with this because it would have taken me a long time before, but I got it pretty quickly and without having to fudge anything.
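The transistor-to-resistor simplification step can even be expressed mechanically: in this logic family, a depletion load whose gate ties back to one of its own channel terminals just acts as a resistor. A sketch over a hypothetical mini-netlist format (the names and wiring below are illustrative, not the actual CM-RAM 0 input stage):

```python
def fold_loads(netlist):
    """Replace depletion transistors whose gate ties back to one of their
    own channel terminals with resistors, mirroring the schematic cleanup."""
    out = []
    for name, kind, pins in netlist:
        if kind == "depletion" and pins["gate"] in (pins["source"], pins["drain"]):
            out.append(("R" + name[1:], "resistor",
                        {"a": pins["source"], "b": pins["drain"]}))
        else:
            out.append((name, kind, pins))
    return out

# Hypothetical mini-netlist: (name, kind, pin connections).
netlist = [
    ("Q1", "depletion",   {"gate": "VDD", "source": "out", "drain": "VDD"}),
    ("Q2", "enhancement", {"gate": "in",  "source": "VSS", "drain": "out"}),
]
print(fold_loads(netlist))
```

Q1 collapses to a resistor between out and VDD, while the enhancement transistor passes through untouched, which is exactly the simplification done by hand above.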

Okay, but that's only part 1, where we've managed to reverse the mask into the schematic. For part 2 we need to see if we could figure this out from just the optical images. Here's top metal:



And delayered:


Which I've just roughly cut by using a box select + copy/paste from the full size images. I could of course work with the full size images, but they are large enough that they take up a significant amount of memory.
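For a rough sense of why I'd rather not keep the full-size images open, here's a back-of-the-envelope memory estimate, assuming made-up but plausibly large dimensions held uncompressed as 24-bit RGB:

```python
# Back-of-the-envelope: what a large stitched scan costs to hold
# uncompressed in RAM as 24-bit RGB. Dimensions are made up.
width, height = 40_000, 25_000   # one gigapixel
bytes_per_pixel = 3
gib = width * height * bytes_per_pixel / 2**30
print(f"{width}x{height} RGB ~ {gib:.1f} GiB uncompressed")
```

Even before any editor overhead, undo history, or extra layers, that's several GiB, so small crops are much friendlier to work with.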

Anyway, I then imported them into Gimp by creating a new, slightly oversized canvas (700 X 700), importing them as layers, and setting their opacity each to 50%:


You'll notice the two images aren't quite aligned. This is why I wanted the oversized canvas: it allows me to move them around until they line up. I should mention that I am by no means a Gimp expert, so don't take anything I do as the best way possible.
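The 50% opacity trick is just an even blend of the two images. If you wanted to script the same check outside Gimp, the math is simply (with toy grayscale "images" here, not the real crops):

```python
def blend(a, b, alpha=0.5):
    """Even blend of two equal-size grayscale 'images' (lists of rows),
    the same effect as stacking two Gimp layers at 50% opacity."""
    return [[round((1 - alpha) * pa + alpha * pb) for pa, pb in zip(ra, rb)]
            for ra, rb in zip(a, b)]

# Toy 2x2 stand-ins for the top metal and delayered crops:
top_metal = [[0, 255], [255, 0]]
delayered = [[255, 255], [0, 0]]
print(blend(top_metal, delayered))
```

When the layers are misaligned, features from the two crops land on different pixels and the blend looks doubled; once aligned, they reinforce each other.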

Next, I created a layer for each mask. Since it's difficult to tell buried contacts from regular contacts just by looking, I just created a single contacts layer (though there aren't any in our example). Since metal is very clear and unambiguous, I started with that by turning opacity back off and hiding the delayered image.

Press B to select the path tool. Click around a metal segment to form a polygon, remembering to click on the start point to close it. You should now have a dashed line around your metal segment:


Then press Shift-V to convert the path into a selection. Then press Shift-B to select the fill tool and pick a color (I used blue). Finally, fill the color in by clicking the mouse on the selection, and then de-select your polygon by pressing Ctrl-Shift-A:


Make sure you put it on the metal layer by toggling show/hide for the metal layer. I believe this is the technique that was used to create the masks for the visual-6800 (plus a tool that converted the masks to JSSim input). A few tips:
  • You don't have to de-select the previous path. Just start working on the next one after filling; the previous selection will automatically clear when you convert the new path
  • You don't need to close the polygon. Last and first points will be assumed to be connected
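If you ever want to turn traced polygons into mask bitmaps outside Gimp (say, to feed a netlist extractor), the auto-closing behaviour in the second tip corresponds to treating the vertex list as implicitly closed. A minimal even-odd fill sketch, with a made-up polygon on a tiny grid:

```python
def point_in_polygon(x, y, poly):
    """Even-odd rule; poly is a list of (x, y) vertices, implicitly closed
    (the last vertex connects back to the first, like the Gimp path tool)."""
    inside = False
    n = len(poly)
    for i in range(n):
        (x1, y1), (x2, y2) = poly[i], poly[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # this edge straddles the scan line
            if x < x1 + (y - y1) * (x2 - x1) / (y2 - y1):
                inside = not inside
    return inside

# Rasterize a made-up traced 'metal' polygon onto a 6x6 grid:
metal = [(1, 1), (4, 1), (4, 4), (1, 4)]
grid = [["M" if point_in_polygon(x + 0.5, y + 0.5, metal) else "."
         for x in range(6)] for y in range(6)]
print("\n".join("".join(row) for row in grid))
```

Note that the `% n` wrap-around is what supplies the closing edge, so an unclosed vertex list still fills correctly.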
Once metal is done, contacts are good to do next. As we saw with the original pull-up, diffusion and poly can be difficult to tell apart just by looking at them. However, we know that IC designers have specific goals, so we can use those to figure out which is which. (That said, some ICs are obfuscated, but we'll ignore that for now.) Poly will usually cross diffusion areas; in particular, look for it jutting out to complete a diffusion cut-off. Also, in the Flylogic 4004 example, poly tends to get a lot of brown discolouration while the diffusion does not.

My first cut looked like this:


A few issues:
  • Need to figure out how to make straight, even paths in Gimp so that it looks nicer. Right now it looks more like a kindergarten or abstract art drawing
  • The bit of diffusion that forms the pull-up is not connected. Cross-referencing with the masks, it is supposed to jut out to the side to go to a buried contact / via stack. However, in the image it looks pretty clearly to NOT do that, as I can clearly see the edge of the diffusion. Maybe I'll get some feedback to clear this up
  • I also made an error with some poly to the right of this going to a buried contact, but this is visible in the images and a double check would have caught it. I also cared less since it wasn't part of the circuit I was analysing
  • What is the stuff under the pad? Maybe diffusion, but it's not in those masks. There is a matching passivation mask, but that should be on top and this is on the lower layers
This is close enough to the original that I think I could have derived the same schematic, although maybe with low confidence on the pull-up part. The rest of the chip can be decoded by continuing this exercise.

Hopefully this served to augment the Flylogic article and not just repeat it. In any case, it helped me make sure I really understood it.