|Excerpted from Bob Metcalfe's Reddit post today:|
On May 22, 1973 with David R. Boggs, I (Bob Metcalfe) used my IBM Selectric with its Orator ball to type up a memo to my bosses at the Xerox Palo Alto Research Center (PARC), outlining our idea for this little invention called “Ethernet”, which we later patented.
(end of excerpt)
I've made my living via Ethernet since the mid-'80s, when I crawled around my office installing coaxial cabling and Ethernet terminators for my Novell NetWare network. (NE1000 cards, anyone?)
Things have improved a lot since then. We all owe Bob and Dave a huge debt.
|I’m getting ready to head out to Las Vegas this weekend to get the Small Tree booth all set up (SL6005) and I’m really excited.|
First off, we have a brand new version of our Titanium platform coming out called “Titanium Z”. The Z platform is AWESOME and the folks here at Small Tree (including The Duffy) are very excited to start telling people about it.
In keeping with our history of bringing really high-tech functionality (like real time video editing) down into the commodity price space, we are now bringing down Storage Virtualization.
To offer Virtualization, we had to migrate Titanium to a new OS based on FreeBSD. In doing this, we were able to pull in ZFS technology. This gives us the ability to stripe RAID sets together, migrate data around, and add new RAID sets to existing volumes without rebuilding.
We’ve also updated all the hardware, increased performance 25% and kept our same great low price model. You get more for your money.
The Titanium 4 has also been extensively improved based on customer feedback. ZFS performance is so good, we ditched the need for a RAID controller in the new T5. At the same time, we added a 5th drive (more storage, more performance) and allowed for the addition of a dual port 10GbaseT card. So now, not only is the device mobile, fast and inexpensive, it also supports direct attaching with 10Gb Ethernet! You can bring along one of our ThunderNET boxes on your shoot and have your laptop editing over 10Gb Ethernet right out in the field.
Lastly, I’ve had tons of people bugging me about SSDs and 10Gb. I demoed a super fast box at the Atlanta Cutters called “Titanium Extreme” and we showed off real time video playing to my laptop (over Dual 10Gb ports) going 1.2GBytes/sec. (not a benchmark. Real video). We’ll have this guy along as well.
So if you want to stop by and visit us and see all this cool stuff, swing down to the South Lower hall (SL6005). You can’t miss us. We will have a giant round screen hanging above us with all sorts of amazing stuff flying by, put together by Walter Biscardi. We’d love to see you.
|Every year as NAB approaches, the marketing once again begins. Oh the marketing....|
As NAB approaches, I'd like to take a moment to remind people in the market for storage that Gigabytes/second is not what makes video play smoothly.
Vendors with no Computer Engineers on staff will pull together monstrous conglomerations of SSDs and RAID cards, run a few benchmarks (probably four or five different ones until they find one they like) and then claim they've hit some huge number of Gigabytes per second.
Small Tree has been supporting server-based video editing longer than anyone in the market. We were supporting Avid when they used SGIs 10 years ago (and they were SGI's largest customer). We know how things work. We helped develop them.
Playing video requires a RAID configuration that can handle multiple, clocked streams. Benchmarks, on the other hand, tend to use a single stream, reading sequentially as fast as they can.
What's the difference you ask? Well, in the sequential case, the RAID controller gets to use lots of tricks to avoid the hard work of seeking around disks and reordering commands. The next block to be read is probably the next block, so things like "read ahead" work wonderfully. Don't just read the next 128k, read the next 1MB! It'll all be read next anyhow. It makes it very easy for sequential benchmarks to look good. In the Supercomputing world, meaningless TeraFLOP marketing numbers were referred to as "MachoFLOPS". We knew they meant nothing when vendors could spin assembly instructions in a tight loop and claim 1.5PetaFLOPS.
Small Tree's testing and development involves looking carefully at how the Video Editing Programs themselves read so we can carefully mimic that traffic during testing. This lets us be sure our equipment doesn't rely on sequential tricks to deliver real, multi-stream performance.
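To make the difference concrete, here is a minimal sketch of the two access patterns. It is not Small Tree's test tooling. The block numbers, stream count and the `contiguous_fraction` measure are assumptions picked purely for illustration. It counts how often a request lands on the block right after the previous one, which is the case read ahead is built for.

```python
# Illustrative only: compares a single sequential reader with several
# interleaved "clocked" streams. All numbers are made up for the sketch.

def sequential_reads(start, count):
    """Block numbers requested by one stream reading straight through."""
    return [start + i for i in range(count)]

def interleaved_reads(stream_starts, reads_per_stream):
    """Each tick, every stream asks for its next block in turn."""
    order = []
    for tick in range(reads_per_stream):
        for start in stream_starts:
            order.append(start + tick)
    return order

def contiguous_fraction(blocks):
    """Fraction of requests that hit the block right after the previous one."""
    if len(blocks) < 2:
        return 0.0
    hits = sum(1 for prev, cur in zip(blocks, blocks[1:]) if cur == prev + 1)
    return hits / (len(blocks) - 1)

single = sequential_reads(0, 1000)
multi = interleaved_reads([0, 100_000, 200_000, 300_000], 250)

print("sequential benchmark:", contiguous_fraction(single))
print("four clocked streams:", contiguous_fraction(multi))
```

Both runs request the same 1000 blocks in total. In the single stream every request follows the previous block, while in the four-stream case none do, so the storage has to seek between streams instead of coasting on read ahead.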
So when you walk up to a vendor at NAB and they start telling you about their MachoGigabytes per second, make sure you ask them about their sustained latency numbers. Small Tree knows all about latency and we back it up, every day with our products.
|Very recently, Small Tree had the opportunity to go down to Atlanta and visit Walter Biscardi and upgrade his data center and edit suites. In conjunction with this trip, we also did a presentation on the upgrade for the Atlanta Cutters and showed off a new SSD based Titanium shared storage system we put together. This new Titanium SSD was able to move 1.2GB/sec of *realtime* video to Adobe Premiere with no dropped frames. This is faster than you can go with 8Gb Fibre Channel and the fastest realtime video I've ever seen displayed live without a net!|
The upgrade involved pulling out Walter's existing SFP+ 10Gb switch, which had a mix of Gigabit SFP modules for his suites and 10Gb SFP+ modules for his server, and replacing it with a 10GbaseT switch from Small Tree that had 4 SFP+ ports (for the server) and 24 10GbaseT ports for the new Titanium and some of his edit suites.
Before we dove into putting in the new switch and adding the Titanium 8, we spent a lot of time talking about power. Walter didn't want to spend $1000 on a UPS, but he wanted a good one that could handle the new load without breaking the bank. We settled on an Ultra Xfinity that offered 1200W of load. This allowed for plenty of overhead for the 660W Titanium and kept the loading on the UPS to well under the recommended 80%.
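The sizing arithmetic is simple enough to sketch. This assumes the 660W Titanium is the only load on the UPS and treats the 80% figure as a simple ceiling. The function names are made up for the example.

```python
# Rough UPS sizing check using the numbers from the install above.
# Assumption: the Titanium is the only device on this UPS.

def ups_load_fraction(load_watts, ups_capacity_watts):
    """Share of the UPS rating consumed by the attached load."""
    return load_watts / ups_capacity_watts

def headroom_watts(load_watts, ups_capacity_watts, max_fraction=0.8):
    """Watts left before the load reaches the recommended ceiling."""
    return ups_capacity_watts * max_fraction - load_watts

fraction = ups_load_fraction(660, 1200)
spare = headroom_watts(660, 1200)

print(f"load: {fraction:.0%} of rating, {spare:.0f}W below the 80% line")
```

With these numbers the UPS sits at 55% of its rating, leaving roughly 300W before it reaches the 80% mark.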
After installing the new switch, we moved all the cables over. One of the wonderful aspects of 10GbaseT is that we didn't have to do anything special when replacing ports that used to be Gigabit. 10GbaseT clocks down to Gigabit and even 100Mbit. So there was no trouble with legacy equipment or special adapters.
Once the switch was in, we turned to the Titanium 8. We installed it and plugged it into its new UPS and cabled it into the switch. We bonded the two 10GbaseT ports coming from the Titanium so it would load balance all the incoming clients.
Once that was done, it was time to upgrade some of the more important edit suites to 10GbaseT. What good is having all that 10Gb goodness in the lab when you can't feel the power all the way to the desktop? We upgraded both of Walter's iMac systems to 10Gb (via ThunderNET boxes) and added another 10Gb card to his fastest Mac Pro in Suite 1.
The result was a cool 300MB/sec writing from his iMac and 600MB/sec reading using the AJA System Test. As I tell people, this isn't the best way to measure NAS bandwidth because applications like Final Cut and Adobe use different APIs to read their media files.
With the NAB Show approaching, I hope many of you that are planning to attend will be able to swing by Small Tree’s booth (SL6005) to learn more about this recent install directly from Walter, as he’ll be on-hand. While you’re there, feel free to ask about the SSD based Titanium shared storage solution we’re “going plaid” with.
If you’d rather not wait until NAB to learn more, contact me at modica at small-tree.com
|Storage is a tough market and customers are always willing to pay a little less to get a little less. My takeaway is this: In the war between Ethernet and EtherNOT based storage, such as Fibre Channel, the one that delivers the best value for the lowest price is going to win. As Warren Buffett likes to say, "In the short term, the market is a popularity contest. In the long term, it's a weighing machine." People need to buy based on value over time.|
Fibre Channel has been hamstrung for a long time by its need for custom ASICs (chips used to implement the protocol in hardware). Fibre Channel wanted to overcome all of the limitations of Ethernet and so they invented a protocol that did just that. The problem of course is that those custom ASICs are not on motherboards. You don't get FC chips built into your Dell server (unless you order a special card or riser). You don't see Apple putting FC chips on Mac Pros (even though they sold Xsan and XRAID for so long).
What's the result? Expensive chips. It's expensive to fab them and expensive to fix them. FC stuff is expensive. Vendors may find ways to lower the entry point, but somewhere or other, either via support, licensing or upgrades, the cost will be high.
Ethernet certainly has ASICs as well. There are network processors, MAC (media access control) chips and PHY chips (the chips that implement the physical layer). They can be incredibly expensive. The first 10Gb cards Small Tree sold were $4770 list price! But here's the thing...a 10Gb card today is $1000 or less. The chips are everywhere and they are rapidly going onto motherboards. Ethernet is truly ubiquitous and will continue to be for server and storage technologies.
If you'd like to discuss or debate Ethernet vs EtherNOT, send me an email at firstname.lastname@example.org or hit me up on Twitter @svmodica.
|Not too long ago, I was asked to write up my predictions on storage and networking technology for the coming year. One of those predictions was the rise of new, combined file system/logical volume managers like ZFS and Btrfs.|
These file systems don’t rely on RAID cards to handle things like parity calculations. They also don’t “hide” the underlying drives from the operating system. The entire IO subsystem - drives and controllers - is available to the operating system and data is laid out across the devices as necessary for best performance.
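To give a feel for what "parity calculations" in software means, here is a minimal single-parity sketch using XOR. This is the textbook technique and not ZFS's actual RAID-Z code or on-disk layout. The function names and the tiny four-byte "blocks" are assumptions for illustration.

```python
# Textbook XOR parity computed on the CPU rather than on a RAID card.
# Not ZFS's implementation; block contents are toy data.
from functools import reduce

def xor_parity(blocks):
    """XOR equal-length data blocks together into one parity block."""
    length = len(blocks[0])
    if any(len(block) != length for block in blocks):
        raise ValueError("all blocks must be the same length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def rebuild_missing(surviving_blocks, parity):
    """Recover one lost block from the survivors plus the parity block."""
    return xor_parity(list(surviving_blocks) + [parity])

data = [b"abcd", b"efgh", b"ijkl"]   # one block per drive
parity = xor_parity(data)            # stored on a fourth drive

# Pretend the second drive died and rebuild its block.
recovered = rebuild_missing([data[0], data[2]], parity)
print(recovered == data[1])
```

The point is only that this is ordinary arithmetic the host CPU can do, which is why a software file system can take over work that used to live on a dedicated controller.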
As we’ve begun experimenting ourselves with these technologies, we’ve seen a lot of very promising results.
First and foremost, I think it’s important to note that Small Tree engineers mostly came from SGI and Cray. While working there, most of our time in support was spent “tuning.” People wouldn’t buy SGIs or Crays simply to run a file server. Invariably, they were doing something new and different like simulating a jet fighter or rendering huge 3D databases to a screen in real-time. There would always be some little tweak required to the OS to make it all work smoothly. Maybe they didn’t have enough disk buffers or disk buffer headers. Maybe they couldn’t create enough shared memory segments.
Small Tree (www.small-tree.com) has always brought this same skill set down to commodity hardware like SATA drives and RAID controllers, Ethernet networks and Intel CPUs. These days, all of this stuff has the capability to handle shared video editing, but quite often the systems aren’t tuned to support it.
I think ZFS is the next big step in moving very high-end distributed storage down into the commodity space.
Consider this: A typical RAID card is really an ASIC (Application Specific Integrated Circuit). Essentially, some really smart engineering guys write hardware code (Verilog, VHDL) and create a chip that someone “prints” for them. SGI had to do this with their special IO chips and HUB chips to build huge computers like the Columbia system. Doing this is incredibly expensive and risky. If the chip doesn’t work right in its first run, you have to respin and spend millions to do it again. It takes months.
A software based file system can be modified on the fly to quickly fix problems. It can evolve over time and integrate new OS features immediately, with little change to the underlying technology.
What excites me most about ZFS is we can now consider the idea of trading a very fast - and expensive - hardware ASIC for a distributed file system that uses more CPU cores, more PCIe lanes and more system memory to achieve similar results. To date, with only very basic tuning and system configuration changes, we’ve been able to achieve Titanium level performance using very similar hardware, but no RAID controller.
So does this mean we’re ready to roll out tomorrow without a RAID controller?
No. There’s still a lot of work to do. How does it handle fragmentation? How does it handle mixed loads (read and write)? How does it handle different codecs that might require hundreds of streams (like H.264) or huge codecs that require very fast streams (like 4K uncompressed)? We still have a lot of work to do to make sure ZFS is production ready, but our current experience is exciting and bodes well for the technology.
If you’d like to chat further about combined file system/logical volume managers, other storage/networking trends, or have questions regarding your workflow, contact email@example.com.
|Back when I was at SGI slaying dragons I had the good fortune to visit America Online. At the time, AOL was moving about 20% of the USA’s email traffic through SGI Challenge XL servers.|
This was around the time they crossed 10 million (with an M) users. That was a lot back then – there were t-shirts printed. Facebook is approaching 1 billion (with a B) users today.
As you can imagine, AOL introduced some serious issues of scale to our products. We’d never really had anyone use a Challenge XL server to handle 100,000 mail users (much less five gymnasiums full of Challenge XL servers to handle 10 million). Having so many systems together created some interesting challenges.
First off, when a customer has two systems and you have a bug that occurs once a year, the two-system customer may never see it. If they do see it, they might chalk it up to a power glitch. Your engineers may never get enough reports to fix the problem since it’s simply not reproducible in a way you can recreate.
Not so with 200 in a room. You might see that once a year glitch every day. That’s a very different prospect. Manufacturing will never see that in quality control running systems one at a time. It can only be seen with hundreds of systems together.
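A quick back-of-the-envelope sketch shows how fast the odds change with fleet size. It assumes the bug hits each machine about once a year, independently of the others, and models arrivals as a Poisson process. Those are simplifying assumptions for the example, not measurements from AOL.

```python
# Back-of-the-envelope fleet math. Assumes one glitch per machine per
# year, independent machines, and Poisson arrivals (all assumptions).
import math

DAYS_PER_YEAR = 365

def days_between_events(machines, events_per_machine_year=1.0):
    """Average gap, in days, between glitches anywhere in the fleet."""
    return DAYS_PER_YEAR / (machines * events_per_machine_year)

def chance_of_event_today(machines, events_per_machine_year=1.0):
    """Probability that at least one machine glitches on a given day."""
    daily_rate = machines * events_per_machine_year / DAYS_PER_YEAR
    return 1.0 - math.exp(-daily_rate)

for fleet in (2, 200):
    gap = days_between_events(fleet)
    chance = chance_of_event_today(fleet)
    print(f"{fleet:>3} machines: one glitch every {gap:.1f} days, "
          f"{chance:.1%} chance on any given day")
```

Under those assumptions a two-system shop waits about half a year between incidents, while a room of 200 sees one every couple of days. A slightly higher per-machine rate turns that into a daily event.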
In AOL’s case, we had the “Twilight Hang” (and no, there were no vampires that sparkled). A machine would simply “stop.” There was no core dump. It could not be forced down and there would be no error messages. The machine was simply frozen in a twilight state. This is the worst possible situation because engineers and support personnel cannot gather data or evidence to fix the problem. There’s no way to get a fingerprint to link the problem to other known issues.
SGI mustered a very strong team of people (including me) to go onsite with a special bus analyzer to watch one of the machines that seemed to hit the problem more than the others. I was there for three weeks. In fact, my fiancée actually flew out for the last of the three weeks because it was her birthday and I was not scheduled to be gone that long.
I can recall one highlight from this trip was me sitting in a room with some of the onsite SGI and AOL people having a conference call with SGI engineering and SGI field office people. During this call, the engineering manager was explaining the theory that the /dev/poll device might be getting “stuck” because of a bug with the poll-lock. Evidently, the poll-lock might get locked and never “unlocked,” which would cause the machine to hang. I had to ask, “Carl, how many poll locks does it take to hang a system?” There was dead silence. I came to find out that the other SGI field people on the phone had hit mute and were rolling on the floor laughing. (Thanks guys). The Corporate SGI people were not amused.
Anyhow, the ultimate cause of the problem was secondary cache corruption. IRIX 5.3 was not detecting cache errors correctly, and when it did detect one, it would corrupt the result every other time. Ultimately, they completely disabled secondary cache correction and to this day, you IRIX users will notice a tuning variable called “r4k_corruption.” You have to turn that on to allow the machine to attempt to correct those errors (even at the risk of corrupting memory). The ultimate solution for AOL was to upgrade to R10k processors that “correctly” corrected secondary cache errors every time.
|I did some testing with Strawberry from Flavoursys last week. Flavoursys provides software for project sharing and project management for video post-production. I’m happy to report that this is one of the most exciting new products I’ve had a chance to work with in a long time.|
What is it?
Strawberry gives you the ability to share Avid projects and media from shared storage without the usual indexing issues or potential corruption problems. It provides for user and group access, metadata-based search and safe read-only access to projects and media that others are working on.
How does it do it?
It’s pretty ingenious. Strawberry is a client system that sits on your network. You permanently mount your shared storage to the Strawberry system so it has full administrative access. It will work all its magic on your server via this mount.
It will create project and media sub directories for your workstations. In my very simple setup, they were edit_1 for media and edit_1p for projects. There were also directories for edit_2, edit_3 and so on.
Users will access the Strawberry server via a web browser or built-in app. They log in with a name the administrator provides and are then able to create and open projects. When a project is created and opened, the appropriate project files and resources are created on the “main” storage and links are then generated in the user's workstation project and media directories. Those links remain as long as the project remains open in Strawberry. When the project is closed, those links are removed.
Here's a screen snap of the blank Strawberry startup window:
And here's another shot of what it looks like to create a new project (it allows you to enter a rudimentary amount of metadata which is used for project naming):
What’s great about this setup is that this same user can open other users’ projects and “add” them to his own (read only) so that he can share timelines and media that might be important. All of this is managed via these dynamic links in the edit_1p and edit_1 subdirectories.
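The link idea can be sketched in a few lines. To be clear, this is not Strawberry's code. I'm assuming the links are symbolic links, and `open_project`, `close_project` and the directory names are made up for the example, which runs entirely in a temporary directory.

```python
# Hypothetical sketch of "link on open, unlink on close". Not Strawberry's
# implementation; symbolic links and all names here are assumptions.
import os
import tempfile

def open_project(main_storage, workstation_dir, project_name):
    """Create the project on main storage and link it into a workstation dir."""
    target = os.path.join(main_storage, project_name)
    os.makedirs(target, exist_ok=True)
    link = os.path.join(workstation_dir, project_name)
    os.symlink(target, link)
    return link

def close_project(workstation_dir, project_name):
    """Remove only the link; the project on main storage stays put."""
    os.unlink(os.path.join(workstation_dir, project_name))

with tempfile.TemporaryDirectory() as root:
    main = os.path.join(root, "main")
    edit_1p = os.path.join(root, "edit_1p")
    os.makedirs(main)
    os.makedirs(edit_1p)

    link = open_project(main, edit_1p, "demo_project")
    linked_while_open = os.path.islink(link)

    close_project(edit_1p, "demo_project")
    link_gone_after_close = not os.path.lexists(link)
    project_survives = os.path.isdir(os.path.join(main, "demo_project"))

print(linked_while_open, link_gone_after_close, project_survives)
```

The workstation only ever sees what is currently linked into its own directory, which is how each editor's view can change from moment to moment without anything moving on the main storage.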
Is it simple to use?
Absolutely. As a user, I would simply log in, click “create,” fill in some basic information - like my name - and select “open.”
Once this was done, I simply opened Avid Media Composer on my client and selected the External Project type and pointed to the edit_1p directory. I could see my new (empty) project and begin to work. My media would be stored in my edit_1 media directory, safely sequestered from other users so there would be no re-indexing issues. I was able to create a number of projects and open them all read-only as add-ons to my new project.
Is it easy to set up?
This was probably the most difficult part about using Strawberry. There were several setup issues, but I know Flavoursys is currently addressing those and will have them fixed shortly. I think the product is very useful and worth the money now. However, I’ll caution potential buyers that there is the possibility they will have to have someone in to help with configuration early on.
To use Strawberry, you have to install a 1U Supermicro chassis onto your network. It boots into Red Hat, but there’s really no other instruction. I had to email Flavoursys to figure out what to do next.
The documentation is very new and pretty raw. So even when following the instructions step by step, I hit a number of puzzle points where I was unsure what to do next. This didn’t stop me from getting it set up, but it could be frustrating for editors who are expecting a plug-and-play solution.
While the system is very elegant in how it works, they brought it to market quickly by using some Windows compatibility tricks. This means there are extra layers that could potentially confuse and complicate the setup.
For example, the actual software is running on a VirtualBox virtual machine under Red Hat Linux. So the server is a Windows 2008 server running under Linux. Your users won’t be connecting to one of the Linux addresses you set up when you connected the machine. They will be connecting to a Windows Virtual address.
Additionally, the client software runs under Microsoft’s Silverlight. So all of your clients will need the Silverlight plugin to get to the Strawberry user interface. I didn’t find this to be a problem - I already had it installed on my laptop - but I had some difficulty getting Firefox to see the plugin on my test iMac (Safari worked fine).
As I understand it, Flavoursys is working on a native version of all of this to improve the performance and simplify the product.
This is a great and innovative product. It will allow for existing Avid stations to exist in broader, heterogeneous environments and reduce significantly the amount of money small shops have to pay for Avid compatible shared storage. Small Tree is now offering Strawberry with its products for this very reason.
Strawberry also offers a great upgrade path for those that are not using Avid today, but believe they will need to in the future. These shops can go out and purchase NAS based storage now, knowing that down the road they can add Avid sharing capabilities without doing a forklift upgrade.
About the only caveat I would put in place on a Strawberry purchase would be that you’ll need someone with strong sysadmin skills for the setup. There are a number of technical steps and concepts (like virtual machines and NFS vs Samba mounts) that you’ll want configured by a pro. Once that work is completed, it should be smooth sailing.
For more information on workflow solutions, visit www.small-tree.com or contact firstname.lastname@example.org.
|I started in the computer industry around 1988 as a computer engineering co-op student with a small company called Herstal Automation. Herstal built memory boards for very old HP 1000 A600 and A900 computers. These computers were some of the very first real-time computers ever made. Auto companies and the medical industry used these to monitor real-time processes like engine performance or patient vital signs.|
To give you some idea how old these were, the machine had toggle switches on the front so you could hand enter machine code instructions one at a time. This was a good way to enter tiny little test programs or force the machine to boot when you were stuck.
Typically, I would build up memory boards and boot the machine to test. Sometimes, a board would fail. When that happened, I would use the front panel toggle switches to put in a small assembly instruction program that would write all 1’s into memory. Then I would dump the memory and see if there were any single zeros (bad data line) or entire words that were 0 (bad address line). I could almost zero in on the exact pin or chip that was failing and replace it. I could unsolder a bad chip and replace it so cleanly that you couldn’t tell it wasn’t machine soldered.
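The same check is easy to express in software today. This is a sketch of the reasoning and not the program I toggled in. It assumes 16-bit words and a dump already read into a list, and `diagnose_dump` is a name invented for the example.

```python
# Sketch of the "write all 1's, then look for zeros" diagnosis.
# Assumes 16-bit words; the function name and dump format are hypothetical.

def diagnose_dump(words, word_bits=16):
    """Return (bit positions stuck at 0, addresses that read back all zero)."""
    all_ones = (1 << word_bits) - 1
    stuck_bits = set()
    zero_word_addresses = []
    for address, word in enumerate(words):
        if word == 0:
            # A whole word of zeros points at the address lines.
            zero_word_addresses.append(address)
        elif word != all_ones:
            # Isolated zero bits point at a data line.
            for bit in range(word_bits):
                if not (word >> bit) & 1:
                    stuck_bits.add(bit)
    return sorted(stuck_bits), zero_word_addresses

dump = [0xFFFF, 0xFFFB, 0xFFFB, 0x0000]
bad_bits, bad_addresses = diagnose_dump(dump)
print("suspect data bits:", bad_bits)
print("suspect addresses:", bad_addresses)
```

In this made-up dump, bit 2 reads zero in two words (a data line suspect) and address 3 reads back entirely zero (an address line suspect).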
One day, I recall a board that was behaving very strangely. I couldn’t seem to get it to power on. There simply were no lights.
I pulled the board out and did a visual inspection. This is tricky because if you look at a pattern (like pins on a board) the eye will see a clean pattern. It’s very easy to miss a bent pin. Our brains fool us into “seeing” the pin even when it’s not there. One has to intentionally look at each and every pin.
Even with all that inspection, I could not see a problem. Finally, I pulled out my voltmeter and started looking for a bad trace. Maybe one of the caps or resistors on the board was simply bad and was not passing current through.
I touched the power pin and the power lines on the downstream chips and could not read a connection. Hmmm… I started working backwards, closer to the pin, but still no connection. Eventually, I had both leads on the pin. Still no connection! The gold pin on the edge of the board was not conducting electricity.
I have to admit, I was not a good physics student. I hated physics. However, I did very well in electronics and I know that gold conducts electricity. In fact, gold is great at conducting electricity. So I did the scratch test. I took a tiny little screwdriver and gently slid it against the pin (scratching gold pins on these boards was a definite no-no). A thin film of plastic bubbled up.
Then I knew what was wrong.
In those days, electronic boards were assembled by hand by installing parts and bending the pins down to hold them in place. The pins were then nipped off and the boards were sent to a “wave solder” facility. These places would pass the board over a flowing “wave” of molten solder and all of the pins and pads would be soldered at once in a very uniform and reliable manner.
There was one problem with wave soldering in that if you had gold connector pins on your card (like a PCI card), the solder would stick to the gold and ruin it.
To avoid this, wave solder companies would use a special water-soluble tape on these edge connectors. After the wave solder was complete, the boards were put into an industrial dishwasher and the tape would dissolve. (This would also clean the boards and make them look nice and new).
So the reason I had this layer of plastic was that our wave solder company’s dishwasher was broken. It was not heating the water enough to completely dissolve the protective tape layer. I confirmed with a quick phone call and used a toothbrush and some hot water to fix the rest of the batch.
Obviously, in this day and age, this is a pretty rare problem. Boards you buy for your edit stations are built in large quantities and quality-control tested on pin grids and test rigs to quickly rule out any obvious problems.
That being said, there are things you need to consider when dealing with large PCIe boards that you might have to plug into an older machine:
1. Make sure you aren’t carrying around any excess static that might zap the board. If the vendor provides one, put on an appropriate grounding strap when installing a board.
2. Be very careful when installing not to knock loose any surface mount components. In the old days, things were soldered right through the board, but today, they are only surface soldered. If you bump one of those little chips too hard, you will knock it off. Depending on what it is, your board may not work at all, or will be intermittent.
3. Watch carefully for interference. Often, large graphics boards have huge heat sinks or cases and fans on them. Make sure none of these devices is touching a neighboring board. This could lead to electrical shorts or overheating.
4. Make sure all wires and cables are strapped or tied down! If you leave unused connectors hanging, one day they are going to end up hitting a fan or a hot component and melting.
5. The 5-volt and 12-volt supply lines inside most computers aren’t going to kill you. However, imagine what might happen if you shorted a ring or watch against a 12V line. It would get extremely hot very quickly and may even melt (while touching your skin!). So don’t take these relatively low voltage supplies lightly.
When working with complex electronic assemblies, following these steps can save you a lot of time and frustration down the road.
|Maybe you’ve heard that expression before. “Hurry up and wait.” Military guys love to quote that. It’s a reference to the military giving soldiers very important things to do, but then having them sit around idly because the people they are supposed to be doing them with aren’t ready.|
Small Tree’s been a Prime Military Contractor for about six years now, and as the lead investigator on most of our projects, I’m no stranger to this. I’ve waited for hours at bases (and at the Pentagon) for someone to come escort me to wherever it was I was supposed to be. You just aren’t allowed to walk into these places.
In one particular instance, I can remember the sheer terror of being on the other side of this equation. Imagine what it’s like to be the person that all the soldiers are waiting for.
In our case, Jeff Perrault and I were at a huge military base helping to test some of our new routers.
The project was simple for us. We built a little router that could connect to a couple of radios. We enabled some basic forwarding and routing, got all the security and IP addresses set up, and poof, there’s data routing between radio networks. Our device was called “Chloe” and the original is sitting on my desk right now showing her battle scars.
The previous week of testing had gone wonderfully. Everything worked and it worked all day. We were thinking about going home. Today’s plan was to line up 200 people, put them in vehicles and run the full test. They were using the same radios, the same routers and the same vehicles, just adding more people to the network. What could go wrong?
Cut to an earlier meeting. In this meeting, we were told of a previous day’s “Vehicle Summit” meeting where it was decided to rewire the vehicles. They no longer wanted to use USB. USB was unreliable. They wanted to use Ethernet. The solution? Put Ethernet dongles on our router and rewire the trucks.
My cell phone rang and it was one of the guys in the lead truck. He was sitting at the end of the road. He was at the front of a column of 200 people. “The router,” he told me, “is not routing.” Jeff and I ran out of the building with our stuff and moved as quickly as we could to the front of the column. We had laptops and wires hanging out as we typed and looked carefully to figure out what happened. The column was sitting in the sun waiting... and watching us.
In this case, changing to Ethernet dongles messed up our configuration script, which really wanted all the ports to come up the same way each time. We changed the MAC addresses so they matched up and everything started working. People in the column started seeing the little dots show up on their displays.
We learned something that day. If we were going to create a router for the military that soldiers would deploy, in vehicles that could be rewired overnight, with no one around who knew how to use a laptop and serial port, it had to be zero configuration. Needless to say, our LEXII router that came out the following year did not require any user input. If you connect it, we’re going to route it whether you like it or not!
|For you old timers, you may remember a story where the USA’s largest Internet service provider went down for 19 hours. For you younger folks, you can read about it here:|
I would hardly know about this story myself except that I received a panicked phone call from the SGI office that served AOL that same day. In that phone call I was asked, “Could installing an SGI server on AOL’s network bring down the entire network?”
Hmmmm… normally, I would have said “no.” Networks should be resilient. People make mistakes on networks all the time. Sometimes they put systems on that have the same IP address. Sometimes they set their subnet or broadcast addresses incorrectly. These simple errors don’t take out buildings.
Even the most egregious error I could think of - somehow looping or routing a switch back to itself - shouldn’t take out the entire network. It might hang a “dumb” switch, but AOL used expensive switches with Spanning Tree Protocol that would prevent such loops. So even if the onsite people had made the very improbable mistake of making the SGI a router and somehow sticking two of its ports onto the same switch, I could not see AOL - as in, the entire company - going offline.
I got off the phone and something started to nag at me. I remembered a case with Chrysler the year before where they had deployed some SGI workstations on their CAD network. When they turned the SGI systems on, the IBM systems would drop off the network. The upshot was that SGI systems were a lot more aggressive when sending packets and we could easily keep the IBM systems from “getting a word in edgewise.”
Could this be it? Did AOL set up a system somewhere that was handling all their DNS or something and we were forcing it off the network?
This is where politics comes in. If we shut off the SGI system and the network “magically” came back, then what? At best, AOL would have been extremely leery about letting SGI add any more servers. At worst, the headline the following day would have read, “SGI takes out entire AOL network!” Dumb luck and coincidence might have put SGI on the front page in a very unfavorable light.
Ultimately, before we could gather any traces, AOL figured it out. The complete solution is explained here: http://news.cnet.com/AOL-mystery-explained/2100-1023_3-220635.html.
As it was explained to me, a redundant router was put in place alongside the existing router to handle AOL’s network traffic. That “new” router had an empty routing table. It then pushed that empty table down to all the sub-routers on AOL’s network, essentially erasing their entire distributed routing table. As I recall, admins were logging into routers and manually entering routes so that different floors of the building could reattach. That let them reach other routers and fix those in turn, until they finally had enough connectivity to get back to the main routing tables, recover everything and push it all back down.
That was a very bad day for those guys, but it was no picnic for me either. ☺
|When I used to work at SGI, I would often wonder what "C" level officers did. I once got to ask Ed McCracken what he spent most of his time doing day-to-day. At the time, he was CEO of SGI.|
His answer was that he was currently spending a lot of time talking to Congressmen trying to convince them to stop propping up Cray as a national asset. In hindsight, perhaps buying Cray was not the best idea.
As the Chief Technical Officer of Small Tree, which is a much smaller company, I have to wear a lot more hats. I thought I might include a list of the things I've been up to over the last month.
Deer hunting (actually, just watching this year)
Evaluating Titanium follow-on chassis designs
Helping select next generation Software Defined Radio development platforms for the Army
Working on Adobe performance issues
Evaluating a new Avid sharing product (that works great!) called Strawberry
Evaluating a new Digital Asset Manager (that also works great) called Axle
Discussing our new high performance iSCSI products with partners
Fixing the phone system
Testing Thursby's Dave software with Avid
Helping customers with Small Tree products
Running barefoot (I run barefoot and in Vibrams.... a lot)
Working on a new voice router design for the US Rangers
Helping my kids with math homework
Processing firewood for the winter
Breaking up the recyclable cardboard boxes
Writing up an NAB presentation proposal
Prepping for a visit from the Soldier Warrior team of the US Army
Small Tree Board of Directors meeting
There’s never a dull moment.
|Not the Star Wars kind tho... |
Back when cell phones were new, a number of vendors had "clone" problems. People were cloning phone serial numbers so they could get free cell service.
To combat this problem, the cellular companies built up "Clone Detector" systems. These were massive database servers that had to be extremely fast. They would monitor all in-process calls looking for two that had the same serial number. If they found a match, that phone had been cloned and both were taken out of service.
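The core check is simple enough to sketch in a few lines of Python. This is only an illustration of the idea, not how any carrier's system actually worked: the function name and the (call id, serial number) tuples are made up for the example, and a real detector would be doing this against a very fast database rather than an in-memory dictionary.

```python
def find_cloned_serials(active_calls):
    """Return the serial numbers that appear on more than one in-process call.

    active_calls is an iterable of (call_id, serial_number) pairs.
    Both names are hypothetical; they stand in for whatever the
    switch reports about calls currently in progress.
    """
    first_call_for = {}   # serial number -> first call id seen using it
    cloned = set()
    for call_id, serial in active_calls:
        if serial in first_call_for and first_call_for[serial] != call_id:
            # Same serial on two different simultaneous calls: a clone.
            cloned.add(serial)
        else:
            first_call_for[serial] = call_id
    return cloned


if __name__ == "__main__":
    calls = [("call-1", "SN-1001"), ("call-2", "SN-2002"), ("call-3", "SN-1001")]
    print(find_cloned_serials(calls))
```

Every call in progress costs one lookup, which is why the speed of the index lookups matters so much in what follows.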
SGI's systems were uniquely qualified to handle this work. The company had some stellar Oracle and Sybase numbers and offered these vendors a 10X speed up in clone detection.
The phone call came in from Florida during the test phase of the new system. The sysadmin called me up and told me that when she dumped a 25% load on the system, it slowed down very quickly. If she put a full load on the system, it stopped.
This was puzzling. I'm not a database expert, so I spent time looking at the normal performance metrics. How busy are the CPUs? Not very. How busy are the (massive) RAID arrays? Not very. How much memory is in use? Not much. Nothing was adding up.
I started watching the machine’s disk activity during the 25% load. I noticed one disk was very busy, but it was not the RAID and shouldn't have been slowing the machine down. I asked the sysadmin about it. She said it was the disk with her home directory on it and it shouldn't be interfering with the machine’s database performance. That answer nagged at me, but she was right. If the database wasn't touching the disk, why should it matter? But how come it was so busy? There was a queue of pending IOs for Pete's sake! Was she downloading files or something?
I asked her if I could take a look at the index files. Index files are used by a database to keep track of where stuff is. Imagine a large address book. I wanted to see if the index files were corrupted or "strange" in any way. I thought maybe I could audit accesses to the index files and spot some rogue process or a corrupt file.
What I found were soft links instead of "real" files. For you Windows people, they are like shortcuts. On a Mac you might call them aliases. She had the index files "elsewhere" and had these links in place to point to them. She told me "Yeah. I do this on purpose so I can keep a close eye on the index files. I keep them.... in... my... home... oh!"
So her ginormous SGI system with hundreds of CPUs and monstrous RAIDs was twiddling its thumbs waiting for her poor home directory disk to respond to the millions and millions of index lookups it could generate a second. It fought heroically, but alas, could not keep up.
Some quick copies to put the index files where they belonged and we had one smoking clone detector system.
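If you ever want to check for this kind of thing yourself, here is a minimal Python sketch. It assumes a POSIX system, and the function name is my own. It lists the soft links in a directory and flags any whose real target lives on a different device than the directory itself.

```python
import os


def links_off_device(directory):
    """List soft links in `directory` and say whether each one resolves
    to a file on a different device (disk) than the directory itself.

    Returns a list of (link_name, resolved_target, on_other_device).
    """
    base_dev = os.stat(directory).st_dev
    results = []
    for entry in os.scandir(directory):
        if not entry.is_symlink():
            continue
        target = os.path.realpath(entry.path)
        try:
            target_dev = os.stat(target).st_dev
        except OSError:
            target_dev = None  # dangling link: the target does not exist
        results.append((entry.name, target, target_dev != base_dev))
    return results


if __name__ == "__main__":
    for name, target, elsewhere in links_off_device("."):
        note = "OTHER DISK" if elsewhere else "same disk"
        print(f"{name} -> {target} ({note})")
```

A dangling link is reported as being on another device, which seemed like the safer way to draw attention to it.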
|Many years ago when I was a "smoke jumper" support guy for SGI, I got to see some of the strangest problems on the planet. Mind you, these were not "normal" problems that you and I might have at home. These were systems that were already bleeding edge and being pushed to the max doing odd things in odd places. Further, before I ever saw the problem, lots of guys had already had a shot. So reinstalling, rebooting, looking at the logs, etc., had all been tried. |
One of my favorite cases was a large Challenge XL at a printing plant. It was a large fileserver and was used for storing tons and tons of print files. These files were printed out, boxed and shipped out on their raised dock.
Each night, the machine would panic. The panics would happen in the evening. The machine was not heavily loaded, but the second shift was getting pissed. They were losing work and losing time. The panics were all over the place - Memory, CPU, IO boards. By this time, SGI had replaced everything but the backplane and nothing had even touched the problem. The panics continued.
Finally, in desperation, we sent a guy onsite. He would sit there with the machine until the witching hour to see what was going on. Maybe a floor cleaner was hitting the machine or there were brown outs going on. We felt that if we had eyes and ears nearby, it would become obvious.
Around 8pm that night, after the first shift was gone and things were quiet, our SSE got tired of sitting in the computer room and walked over to the dock. He was a smoker and he wanted to get one more in before the long night ahead. The sun was going down, making for a nice sunset as he stood out there under the glow of the bug zapper (this happened in the south where the bugs can be nasty).
As he watched, a fairly large moth came flitting along and orbited the bug zapper a few times before *BZZZZZZT* he ceased to exist in a dazzling light display. It was at that moment when the Sys Admin (who was keeping an eye on the machine during our SSE's smoke break) yelled over to him "HEY! The machine just went down again".
Yes folks, the bug zapper was sharing a circuit with the SGI machine. One large insect was enough to sag the circuit long enough to take the machine right down. Go figure.
|Ever since Apple blew up the happy world of FCP 7, we've been running into more and more people moving to Adobe and Avid.|
Adobe's been pretty good. I like them a lot and their support guys (I'm talking to you Bruce) have been awesome.
Avid, on the other hand, is tough. Shared spaces cause reindexing and external projects won't save natively to shared spaces. Our customers have mostly worked around this by storing projects locally, using multiple external volumes for media, or using AMA volumes.
I finally had a chance to explore this External project save issue in great detail today.
It turns out there's nothing specific Avid's doing that would prevent them from saving a project externally. They stat the file a few times and give up. It appears as if they are simply not allowing saves to shared protocol volumes (Samba and AFP).
My solution was simple: I created a sparse disk image on the shared storage, then mounted it locally.
This worked great. I could point my external project to it and it would save correctly. I could link in my AMA files and use them, and I could trust that OS X wasn't going to let anyone else mount that volume while I was using it! When I'm done, I exit and unmount the volume. Now anyone else on the network can mount it and use my project (you can even have multiple users by doing read only mounts with hdiutil).
Further, this method can be adapted to just about any granularity you'd like.
For example, if your users hate the idea of creating disk images for every project, just create one very large (up to 2TB) disk image for their entire project library. You can automount that and just let them use it all the time. You could also have per project, per user or per customer images as well.
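For those who would rather script this than click through Disk Utility, here is a rough Python sketch that assembles the hdiutil command lines for each step. The image path and volume name are placeholders, the function name is my own, and I'm assuming the image mounts at the default /Volumes/<volume name> location. It only builds and prints the commands; hand each list to subprocess.run on a Mac once you are happy with them.

```python
import shlex


def sparse_image_commands(image_path, volume_name, size="2t"):
    """Build the hdiutil invocations for the sparse-image workflow.

    image_path  -- where the image lives on the shared storage (placeholder)
    volume_name -- name the mounted volume will show up under
    size        -- maximum size of the image, in hdiutil's size syntax
    """
    mount_point = f"/Volumes/{volume_name}"  # assumed default mount location
    return {
        # One-time: create a sparse image that grows as data is written.
        "create": ["hdiutil", "create", "-size", size, "-type", "SPARSE",
                   "-fs", "HFS+J", "-volname", volume_name, image_path],
        # The editor who needs to save mounts it read/write.
        "attach_rw": ["hdiutil", "attach", image_path],
        # Everyone else can mount it read only.
        "attach_ro": ["hdiutil", "attach", "-readonly", image_path],
        # Unmount when finished so the next person can use it.
        "detach": ["hdiutil", "detach", mount_point],
    }


if __name__ == "__main__":
    cmds = sparse_image_commands("/Volumes/Shared/Projects.sparseimage",
                                 "AvidProjects")
    for step, argv in cmds.items():
        print(f"{step}: {shlex.join(argv)}")
```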
Let's face it. Storage is expensive and we all know what disks and motherboards cost. No one wants to pay three or four times what this stuff costs for a vendor specific feature. Hopefully, this trick makes it easier for you to integrate Avid workflows into your shop if the need arises.
|Just about everyone can have a free web page. You get them free when you open cloud accounts or purchase internet service. This has led to a proliferation of cat pictures on the Internet. |
Back in the 90s, when it cost a little more to get on the Internet, the idea of personal web pages was just beginning. One very large ISP (Internet Service Provider) that used SGI systems wanted to sell personal websites. They felt SGI's Challenge S system was the perfect solution. They would line up hundreds of these systems, and each system could handle several sites. SGI did indeed set several website access records for handling the website for "Showgirls," which, as you can imagine, had a racy website.
Fast forward a few months and there are 200 systems lined up in racks handling personal web pages. Then I start getting phone calls.
"Hey Steve. These guys are filing cases about two or three times a week to get memory replaced. We're getting parity errors that cause panics about two or three times a week."
I fly out and start looking carefully at the machines. The customer had decided to purchase third party memory (to save money) so they could max out the memory in each system. Each machine had 256MB of RAM, which was a lot at the time. This was parity memory, which means that each 8 bits of data is stored with a ninth parity bit that is used like a cheap "double check" to make sure the value stored is correct. The parity bit is set to a 1 or 0 so that the 8 data bits plus the parity bit always contain an even number of 1s. If the system reads back an odd number of 1s, it knows there's a memory error.
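Here is the parity idea as a small Python sketch. The function names are mine and real memory controllers do this in hardware, but the logic is the same even-parity check described above. Note that a single flipped bit is caught, while two flipped bits in the same byte cancel out and slip through, which is why parity is only a cheap double check.

```python
def parity_bit(byte):
    """Even parity: return the bit that makes the total number of 1s
    across the 8 data bits plus the parity bit come out even."""
    return bin(byte & 0xFF).count("1") % 2


def looks_ok(byte, stored_parity):
    """True if the data bits plus the stored parity bit still contain
    an even number of 1s."""
    return (bin(byte & 0xFF).count("1") + stored_parity) % 2 == 0


if __name__ == "__main__":
    value = 0b10110010                 # four 1s
    p = parity_bit(value)              # stored alongside the byte
    one_flip = value ^ 0b00000100      # a single bit flipped
    two_flips = value ^ 0b00000110     # two bits flipped
    print("original ok: ", looks_ok(value, p))
    print("one flip ok: ", looks_ok(one_flip, p))
    print("two flips ok:", looks_ok(two_flips, p))
```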
I looked at each slot. I looked at ambient temperature. I made sure the machines were ventilated properly (including making the customer cover all the floppy disk holes since they did not have floppies installed, but had neglected to install the dummy bezel). No change. Parity errors continued and clearly there was an issue.
Going back to the memory vendor and the specs on the chips, we started doing the math.
The vendor claimed that due to environmental issues (space radiation etc) one should expect a single bit parity error about once every 2000 hours of uptime for each 32MB of memory. Half of these errors should be "recoverable" (i.e., the data is being read and can be read again just to be sure), but the other half will lead to a panic. They do not mean the memory is broken, but the errors should be rare.
So let's do the math: 256MB per machine (so that's 8 x 32MB).
Hours of uptime per year? (These machines are always up): 8760 hours
How many total parity errors? About 35 per system, per year (8 x 8760 / 2000), with half of them being "fatal." So, that’s 17 panics per system per year. They had 200 systems. That's 3400 panics a year in that group of systems, or roughly 65 per week. That's close to 10 a day?!
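Here is the same arithmetic as a short Python sketch, in case you want to plug in your own fleet. The function and parameter names are my own; the inputs are the vendor's figures quoted above. Carrying the fractions through, instead of rounding down to 17 panics per system, gives about 3,500 panics a year across 200 machines, which works out to roughly 67 a week, or a little under 10 a day.

```python
HOURS_PER_YEAR = 24 * 365  # 8760, machines are always up


def expected_panics(mem_mb, systems, mb_per_block=32,
                    hours_per_error=2000, fatal_fraction=0.5):
    """Expected parity errors and panics per year.

    Uses the vendor's rule of thumb: one single-bit parity error per
    `hours_per_error` hours of uptime for each `mb_per_block` MB of
    memory, with `fatal_fraction` of those errors causing a panic.
    """
    blocks = mem_mb / mb_per_block
    errors_per_system = blocks * HOURS_PER_YEAR / hours_per_error
    panics_per_system = errors_per_system * fatal_fraction
    fleet_panics = panics_per_system * systems
    return {
        "errors_per_system_year": errors_per_system,
        "panics_per_system_year": panics_per_system,
        "fleet_panics_year": fleet_panics,
        "fleet_panics_week": fleet_panics / 52,
        "fleet_panics_day": fleet_panics / 365,
    }


if __name__ == "__main__":
    for name, value in expected_panics(mem_mb=256, systems=200).items():
        print(f"{name}: {value:.1f}")
```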
Consider this when you start to scale up your IT systems. How many machines do you have to put in a room together before "once a year" activity becomes "once a day?"
|You've all read that 10GbaseT is on the way. It's true. Very soon, you will be able to plug standard RJ45 connectors (just like on your MacBook Pro) into your 10Gb Ethernet cards and switches. You'll be able to run CAT6A cable 100m (assuming the runs are clean runs) and have tons and tons of bandwidth between servers and clients. Who needs Fibre Channel anymore?!|
But with the widespread migration to 10Gb, you may have a plumbing problem my friend.
Many years ago, I had the privilege of supporting three of the large animation studios in LA that were trying to use their new RAID5 arrays and run OC-3 and OC-12 right to their desktops. These two ATM standards were capable of 155 Mbit/s and 622 Mbit/s, respectively (this was before the days of Gigabit Ethernet). Everyone expected nirvana.
They didn't get nirvana. In fact, they found out right away that three clients ingesting media could very quickly "hang" their server. Within about 30 minutes it would slow to a crawl and sit there. They could not shut it down. Shutdown would hang. What was really happening? The machine had used all of its RAM collecting data and was unable to flush it quickly enough to their RAID. The machine was out of IO buffers and almost completely out of kernel memory. The "hang" was simply the machine doing everything it could to finish flushing all this unwritten data. We had to wait (and wait and wait).
Further, we discovered that with only three clients we could quickly start generating dropped packets. ATM had no flow control and so too many packets at once would result in dropped packets. Since the clients were very fast relative to the server, it didn't take more than a few to overwhelm it.
Similarly, as we all start to salivate over 10Gb to our MacBooks, iMacs and refrigerators, we should consider how we're going to deal with this massive plumbing problem.
First, you *will* need some form of back pressure. The server must be able to pause clients (and vice versa) or these new 300MB/sec flows are going to overwhelm all sorts of resources on the destination system.
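To make the point concrete, here is a toy Python simulation of a server buffer being fed by a few fast clients. Every number in it is made up for illustration, and the function is my own sketch rather than a model of any real protocol. Without flow control, anything that doesn't fit in the buffer is dropped. With flow control, a client that finds the buffer full simply holds its data back until there's room.

```python
def simulate(ticks, clients, send_per_tick, drain_per_tick,
             capacity, flow_control):
    """Toy model: each tick every client offers `send_per_tick` units and
    the server then drains `drain_per_tick` units to disk.

    With flow_control=False, data that does not fit is dropped.
    With flow_control=True, the client is paused and keeps what does
    not fit, so nothing is lost on the wire.
    """
    buffered = dropped = delivered = 0
    for _ in range(ticks):
        for _ in range(clients):
            room = capacity - buffered
            accepted = min(send_per_tick, room)
            if not flow_control:
                dropped += send_per_tick - accepted
            buffered += accepted
        drained = min(buffered, drain_per_tick)
        buffered -= drained
        delivered += drained
    return {"delivered": delivered, "dropped": dropped, "buffered": buffered}


if __name__ == "__main__":
    settings = dict(ticks=10, clients=3, send_per_tick=10,
                    drain_per_tick=12, capacity=40)
    print("no flow control:  ", simulate(flow_control=False, **settings))
    print("with flow control:", simulate(flow_control=True, **settings))
```

Either way the disks only absorb what they can absorb. The difference is whether the excess gets thrown away and retransmitted or politely waits its turn at the sender.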
Second, just because the network got faster doesn't mean the disks did. In fact, now your users will have ample opportunity to do simple things like "drag and drop copies" that will use up a great deal of the resources on the server. A simple file copy over 10Gb at 300MB/sec bidirectional could overwhelm the real-time capabilities of a normal RAID. The solution lies in faster RAIDs, SSDs and perhaps even 40Gb FCoE RAIDs for the servers. (That's right, 40Gb FCoE RAIDs.)
So as you consider your 10Gb infrastructure upgrades, make sure you're working with an experienced vendor that knows about the pitfalls of "plumbing problems" and gets you set up with something that will work reliably and efficiently.