22 Jan

NASA’s Space Shuttle – From Top to Bottom (Infographic)

I found this pretty cool Infographic below about the Space Shuttle (via Space.com).

I like to remember and share the special time I spent working in the Space Shuttle program. It was an awesome program, with awesome people and an awesome manned space transportation system. One of the coolest areas of the whole system is the Launch Pad, and the little room called the White Room; scroll down towards the middle of the Infographic and you will see where it is located. Back during STS-60 I had the chance to spend time up there in the White Room (Launch Pad 39A) while Discovery was being prepared for the following day’s launch, on a cool February morning.

A good summary of my time in the space program is documented here: Looking back at the Space Shuttle Program. And a short video that I took of the last Space Shuttle launch is here: Launch of STS-135 Atlantis (final mission of the Space Shuttle); you can see, hear and feel everyone’s excitement. Some of my last contributions to the Space Shuttle included the I/O profile design in support of GPS (which displaced TACANs) and preliminary work towards the very cool Glass Cockpit, photographed below.

My brother (who still works at NASA) was chief engineer for the heat tiles (thermal protection system or TPS). We are both Silver Snoopy recipients for our contributions to the manned space program; I am not sure how many brothers there are who have received a Silver Snoopy and/or who have worked together on the same space mission (I bet very few!).

Awesome times, and great memories… there is nothing like the space program (private or not).

Godspeed to the next generation of Astronauts, and to the private and government-funded space programs. And I may sound biased here, but I cannot wait to see Blue Origin (which I almost joined back in 2005) and its New Shepard make it to space.


A graphical representation of NASA’s space shuttle.

Source Space.com: All about our solar system, outer space and exploration

ceo

13 Jul

Launch of STS-135 Atlantis (final mission of the Space Shuttle)

Video of the last Space Shuttle launch (Atlantis, STS-135) as I saw it on July 8th, 2011…

(thanks to James Daniels for the video editing/rotation effects)

This is from 7 miles away. If you listen carefully, you can hear the double-boom as the vehicle breaks the sound barrier.

This is the last time my code gets to fly to space, so it was pretty special for me to go see this launch happening…

Related to this, see Looking back at the Space Shuttle Program.
ceo

06 Jul

Looking back at the Space Shuttle Program

Tomorrow I leave for Cape Canaveral (Kennedy Space Center) to see the last launch of the Space Shuttle, STS-135.

(I hope it doesn’t get delayed).

I am very excited about this. I was able to get some tickets at the last minute to see this launch from ‘up close’ (~7 miles away) from the KSC Visitor’s Center. Back in 1994 when I used to work on the Shuttle, I was able to see the launch of STS-60 from really up-close (~3 miles); and there is nothing like seeing a Shuttle launch.

Below is a post I made to the Illlist group on my Shuttle story, which I am posting here on my blog as well. Some aspects of it are a bit technical. I am sharing this here so that everyone who reads this understands the awesomeness of the Space Shuttle and of our manned space program.

Godspeed!


Looking back at the Space Shuttle Program
C. Enrique Ortiz | July 4, 2011

For months I have been looking to get tickets to see the last Shuttle mission from up-close. I was lucky to get the tickets to go see the launch of Atlantis thanks to @hugs (Jason) and the Illlist.

I mentioned to Jason that going to see this last mission of the Space Shuttle was a special event for me, and he asked me to post my story to the Illlist. It has been a very long time since I worked on the Shuttle program, so bear with me, but here it goes…

This last Shuttle mission has a lot of meaning to me because it is the last flight on which my contributions to the Space Shuttle onboard system software (SSW) will be flying. As for many thousands of others, STS-135 on Atlantis is the last flight that will carry to space all of our contributions to the USA manned space program.

I grew up loving Space (and Astronomy). Movies like 2001: A Space Odyssey changed the way I saw the future and space exploration, and taught me about that cool thing called computers, which later on turned out to be how I made my living. After I graduated from college my first job was in computers (software) and the space program, specifically the manned space program. Awesome. And to Houston I went. It was so exciting: the environment, the history, the astronauts, the vehicles and writing software for them.

This was back in the 1990s. I worked on the Shuttle program for ~5 years. My job was as a SW engineer on the Flight Computer OS (FCOS). I “quickly” learned the codebase, which was very complex and took me a year to master, or at least I thought I did. Then I began supporting flight missions (~20 in total) from launch to landing, and writing code in support of new CPU capabilities and/or new features, some small, some very large and all very critical.

The team I worked with was the best, and I miss them all. A bunch of great people, smart people, very passionate people. The space program was in our veins. We were all so proud of the space program and contributing to the mission’s success. There was nothing like getting the vehicle ready for flight (from our software point of view), and seeing that bird roaring up to space.

The Space Shuttle computer systems are *old*, very old. Many of the drivers or requirements for them came from the lessons learned from the Apollo missions; things like screen refresh rates, the redundancy and fail-operational/fail-safe requirements, the fly-by-wire and digital control, and others. The SW implements a number of very important concepts that at the time were, and still today are, very awesome and unique.

First, it probably still is the most complex real-time system out there. There are 5 general purpose computers (GPC). Each GPC has one processor, the AP-101S, and about 1MB of main memory. All of the software ran in less than that. For storage, it relies on magnetic tape! (Not sure if that got upgraded after I left, but those tapes gave us grief as they aged, and I had to write code that took into consideration the changes in spin start/stop and read/write times.) There are 24 I/O buses for commanding/controlling different subsystems of the vehicle. Of the 24 buses, 8 are considered critical, for example, those controlling critical sensors and/or the main engines. Access to each subsystem is redundant via two different paths. A given GPC is assigned a String, which consists of two critical I/O buses to command (one flight-forward and one flight-aft bus), and it listens to all the critical buses (so as to maintain redundancy). That means a given GPC commands a set of 2 buses while it is listening and ready to take over others in case the other commanding GPC fails.
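The string assignment above can be sketched in a few lines of Python. This is only an illustration of the idea, not the flight software: the bus names (FF1, FA1, ...), the `Gpc` class and the `fail_gpc` function are all hypothetical, and I am assuming one string per primary GPC to start with.

```python
from dataclasses import dataclass, field

# Hypothetical names: four flight-forward (FF) and four flight-aft (FA)
# critical buses, paired into four "strings".
STRINGS = {n: (f"FF{n}", f"FA{n}") for n in range(1, 5)}


@dataclass
class Gpc:
    ident: int
    strings: list = field(default_factory=list)  # strings this GPC commands
    healthy: bool = True

    def commanded_buses(self):
        """Buses this GPC is currently commanding (two per string)."""
        return [bus for s in self.strings for bus in STRINGS[s]]

    def listened_buses(self):
        """Every GPC listens to all eight critical buses."""
        return [bus for pair in STRINGS.values() for bus in pair]


def fail_gpc(gpcs, failed_id, takeover_id):
    """Hand the failed GPC's strings to a designated, still healthy listener."""
    failed, backup = gpcs[failed_id], gpcs[takeover_id]
    if not backup.healthy:
        raise ValueError("takeover GPC is not healthy")
    failed.healthy = False
    backup.strings.extend(failed.strings)
    failed.strings = []


# Assumed starting point: GPC n commands string n.
gpcs = {n: Gpc(n, [n]) for n in range(1, 5)}
fail_gpc(gpcs, failed_id=2, takeover_id=1)
print(gpcs[1].commanded_buses())
```

Because the takeover GPC was already listening to every critical bus, nothing has to be "discovered" at failure time; only the commanding role moves.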

Four of the GPCs run the Primary Avionics Software System (PASS) written by IBM (I was an IBMer at the time); GPCs 1-4 run that exact same software image. The 5th computer runs the Backup Flight System (BFS), written by a different vendor, totally independently, from the same requirements given to IBM for the PASS. The idea of PASS vs. BFS is that no single bug should affect all of the computers: if the primary systems fail because of a common issue in that version of the software, the astronauts can engage the backup system. Note that the BFS has never been engaged during an actual mission.

(As a side note, it typically takes around one year for a new version of the software to go from completion to actual flight, due to all the testing and then astronaut training.)

The idea behind this fail-operational/fail-safe modus operandi is a common theme across the Shuttle. Everything is redundant. The system must handle failures such that a single system failure keeps the mission and its crew operational (fail-operational) and two system failures keep the vehicle and its crew safe and able to land (fail-safe). This is the reason there are five GPCs, of which four are primary ones and one is a separate, backup one.

In addition, the PASS GPCs run in a Redundant Set during the critical phases of the mission (lift-off, on-orbit and re-entry/landing). In a redundant set, each of the primary GPCs, as previously mentioned, commands two (of the eight) critical I/O buses at a given time, while listening to the other buses. The idea is that if all computers are running the same software, and are receiving all the same inputs, then all execution should be the same (and all outputs should be the same as well). All computers in the redundant set, which again are running the same SW, sync up at every interrupt (I/O, timer). If a computer fails to sync (does not show up on time) twice in a row, it is voted out of the redundant set by the rest of the computers, and the designated bus-listener now takes command of those critical buses. The failed computer is halted as soon as possible by the astronauts. (While all this is happening, a number of audible alarms are going off.)

The computers also form what is called a Common Set, which can include redundant computers (in a redundant set of their own) and non-redundant computers; these sync up every 160 ms. An example of a redundant set is when the computers are in guidance, navigation and control mode for launch, orbit or re-entry, and an example of a common set is having two computers in a redundant set in orbit, while having a 3rd computer doing systems management dedicated to the robotic arm or the payload (the 4th computer is in stand-by, conserving power). It is uncommon for GPCs to fail to sync from a redundant set, and it is even more uncommon to fail to sync from a common set.
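The "miss two sync points in a row and you are voted out" rule can be sketched as follows. This is a simplification under stated assumptions: the real computers vote among themselves, while here a single hypothetical function, `run_sync_points`, plays the role of "the rest of the computers"; the threshold of two consecutive misses is the only detail taken from the description above.

```python
def run_sync_points(members, arrivals, max_consecutive_misses=2):
    """Simulate a redundant set over a series of sync points.

    members:  ids of the computers that start in the redundant set
    arrivals: one set per sync point, holding the ids that showed up on time
    Returns (remaining_members, voted_out_in_order).
    """
    active = set(members)
    misses = {m: 0 for m in members}
    voted_out = []
    for on_time in arrivals:
        for m in sorted(active):
            # A computer that shows up on time clears its miss count.
            misses[m] = 0 if m in on_time else misses[m] + 1
        for m in sorted(active):
            if misses[m] >= max_consecutive_misses:
                active.remove(m)
                voted_out.append(m)
    return active, voted_out


# GPC 2 misses one sync point, recovers, then misses two in a row.
arrivals = [{1, 2, 3, 4}, {1, 3, 4}, {1, 2, 3, 4}, {1, 3, 4}, {1, 3, 4}]
remaining, voted_out = run_sync_points({1, 2, 3, 4}, arrivals)
print(sorted(remaining), voted_out)
```

A single late arrival is tolerated, which matches the "twice in a row" wording; only the second consecutive miss removes the computer from the set.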

During my Shuttle days I was exposed to a number of great concepts that today are common. There I was first exposed to vector graphics, used in the Shuttle UI/display units, and to the heads-up display (HUD), which I see as my first exposure to “augmenting the reality”. I was exposed to deep embedded real-time programming, hard-core scheduling and redundant systems (as any man-rated software should be) where computers can vote each other out to maintain safety. And I was also exposed to what probably was one of the first real uses of metaprogramming and self-modifying code. Many people don’t know that the Space Shuttle OS implements self-modifying code for the purpose of “fault-tolerance”, where the I/O code will at runtime overwrite itself in order to bypass faulty I/O elements and take control of I/O buses when needed.

Back then I contributed in many ways. We were put under a lot of pressure to find “answers” to issues before we could go for launch or re-enter for landing.

One example was a ‘random’ issue that was showing up where blocks of memory were getting zeroed. It took me 3 weeks to figure that one out. Management was impatient, everyone was. But I finally nailed it when I was able to trace the issue to a *single* instruction of Assembler code. But how could this be? The answer: a microcode bug. The HW folks at first couldn’t believe it; microcode issues are almost unheard of. This was an issue related to how the Move-HalfWord (MHV) instruction behaved when the destination and source addresses overlapped, which was a ‘trick’ used to clear out memory (here 0xdeadbeef helped me find the source of the issue). Once it was found, code audits were done, the code was patched and we were Go.
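For readers who have not seen the overlapping-move trick, here is a small Python sketch of the general idea. It does not model the actual instruction or the microcode bug; the two function names are hypothetical, and the halfword-at-a-time, lowest-address-first copy order is an assumption made for illustration.

```python
def move_halfwords_forward(mem, src, dst, count):
    """Copy count halfwords one at a time, lowest address first."""
    for i in range(count):
        mem[dst + i] = mem[src + i]


def move_halfwords_buffered(mem, src, dst, count):
    """Copy as if the whole source were read before anything is written."""
    snapshot = mem[src:src + count]
    mem[dst:dst + count] = snapshot


# Memory pre-filled with a recognizable pattern, 16 bits per entry.
pattern = [0xDEAD, 0xBEEF, 0xDEAD, 0xBEEF, 0xDEAD, 0xBEEF]

# The trick: zero the first halfword, then move the block onto itself
# shifted by one. Each copied zero becomes the source of the next copy.
cleared = list(pattern)
cleared[0] = 0
move_halfwords_forward(cleared, src=0, dst=1, count=5)

# With buffered semantics the same call just shifts the old contents.
shifted = list(pattern)
shifted[0] = 0
move_halfwords_buffered(shifted, src=0, dst=1, count=5)

print([hex(h) for h in cleared])
print([hex(h) for h in shifted])
```

The trick only clears memory if the move really behaves like the forward copy for overlapping operands, so any deviation in how the hardware handles the overlap changes which locations end up zeroed. A known fill pattern makes it easy to see in a dump which locations were actually touched.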

Another experience was when one of the computers actually failed to sync in orbit during the STS-51 mission. GPC2 was voted out of the redundant set. This particular issue put us under extreme pressure, as our astronauts were in orbit and it was imperative to know whether this was a problem that would affect re-entry. Because the fail-to-sync had occurred on the first or second day of a 2-week mission, we had some time to figure this one out. Using downlink data (every 160 ms), memory dumps, our knowledge of the code and the help of other experts, we all got to work. A lot of detective work. In the end, I could come up with only one answer to the issue, which to this date still stands. After lots of careful analysis I was able to trace the fail-to-sync to a single If-statement or ‘branch out’ instruction, which seems to have taken this particular GPC2 down a different route, so it didn’t show up to sync when it was supposed to. It happened in the DEU UI code (which is coded using the HAL/S programming language). It is as if the contents of the variable being tested were different on this particular computer. This was hard to prove, as I could not see the actual value on the downlink, since it was loaded into registers for the actual branch-out/test. But how could that be? The Space Shuttle computers are space radiation hardened, but are susceptible to soft errors or single-event memory upsets. In space, cosmic radiation will flip bits in memory all the time, especially when over the South Atlantic Anomaly (when entering the anomaly region, you can see the bit-flip count going up like crazy on the monitor screens during mission support). As a side note, the GPCs’ memory can sustain a 1-bit flip on a given word (32 bits) and will self-correct it during memory scrubs, but 2+ bit flips will crash the computer.
Back to the fail-to-sync story: the only part of the processor that is NOT radiation hardened is the registers themselves; so my only possible answer was that when the branch instruction executed, which uses registers (R2 in this case), the value of the register itself must have flipped. Everyone was like, huh? But there we were, there I was, with the analysis, the dumps and the explanation. That was the only explanation; everyone agreed, though some had doubts. We got the go-ahead, and all went well.
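The "one flipped bit is corrected, two are fatal" behavior of the memory is what a single-error-correcting, double-error-detecting (SEC-DED) code gives you. The sketch below uses a textbook extended Hamming code over a 32-bit word; I am assuming that scheme purely for illustration, since the actual GPC memory design is not described here, and the names `encode` and `scrub` are hypothetical.

```python
def _syndrome(code):
    """XOR of the positions of all set bits; 0 for a valid codeword."""
    s = 0
    for pos, bit in enumerate(code):
        if bit:
            s ^= pos
    return s


def encode(data_bits):
    """Build an extended Hamming codeword (list of 0/1) from data bits."""
    code = [0]                      # position 0: overall parity, set last
    data = list(data_bits)
    pos = 1
    while data:
        is_parity_slot = (pos & (pos - 1)) == 0   # positions 1, 2, 4, 8, ...
        code.append(0 if is_parity_slot else data.pop(0))
        pos += 1
    s = _syndrome(code)
    p = 1
    while p < len(code):            # set parity bits so the syndrome is 0
        if s & p:
            code[p] = 1
        p <<= 1
    code[0] = sum(code) % 2         # make the total number of ones even
    return code


def scrub(code):
    """Return (status, word): 'ok', 'corrected' or 'uncorrectable'."""
    s = _syndrome(code)
    odd = sum(code) % 2 == 1
    if s == 0 and not odd:
        return "ok", list(code)
    if odd and s < len(code):
        fixed = list(code)          # odd parity: assume a single flipped bit
        fixed[s] ^= 1               # the syndrome is its position
        return "corrected", fixed
    return "uncorrectable", list(code)


word = 0xDEADBEEF
stored = encode([(word >> i) & 1 for i in range(32)])

one_flip = list(stored)
one_flip[17] ^= 1
two_flips = list(stored)
two_flips[5] ^= 1
two_flips[20] ^= 1

print(len(stored), scrub(one_flip)[0], scrub(two_flips)[0])
```

With 32 data bits this code needs 7 check bits, so each word is stored as 39 bits. In this sketch a double flip is only detected, not repaired, which is the point at which a real system has to give up on that word.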

BTW, one of the reasons memory dump analysis as above was possible is that the memory model used by FCOS is static. It is a very deterministic model, from the rate-monotonic process scheduling, to the I/O profile at any given time, to the memory layout: the I/O and process queues, the interrupt vectors, every piece of code, the patch areas, all of it; you knew the exact layout and location of everything. I could take a memory dump, read it (manually), and tell you exactly what was going on with that particular computer. Today I still believe that for any man-rated software system, static-deterministic models are best; you need to be able to see, explain and save lives by reading a memory dump. I then wrote tools in OS/2 that would take a given dump and tell you what was going on.
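To give a flavor of why a static layout makes dumps readable, here is a toy dump reader in Python. Everything in it is made up for illustration: the region names, offsets, sizes and the big-endian 16-bit encoding are assumptions, not the real FCOS layout, and `LAYOUT` and `read_dump` are hypothetical names.

```python
import struct

# Hypothetical fixed layout: region name -> (byte offset, struct format).
# ">" means big-endian and "H" is an unsigned 16-bit halfword.
LAYOUT = {
    "interrupt_vectors": (0, ">4H"),
    "process_queue_head": (8, ">H"),
    "io_queue_head": (10, ">H"),
    "patch_area": (12, ">2H"),
}


def read_dump(dump):
    """Decode every known region of a raw dump into a tuple of halfwords."""
    return {
        name: struct.unpack_from(fmt, dump, offset)
        for name, (offset, fmt) in LAYOUT.items()
    }


# A made-up 16-byte dump that follows the layout above.
dump = struct.pack(">8H", 0x0100, 0x0104, 0x0108, 0x010C,
                   0x0200, 0x0300, 0x0000, 0x0000)

for name, values in read_dump(dump).items():
    print(f"{name:20s}", [hex(v) for v in values])
```

Because nothing moves at runtime, the table is the whole "parser": the same offsets are valid for every dump from every computer running that software image.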

Before I left the program, I helped in the analysis in preparation for GPS I/O support and the new Glass-Cockpit, but I didn’t get to work on their implementation.

And there are other stories like the above, not only from me, but from many others; amazing stories.

I enjoyed working with the Astronauts themselves, and it was very cool to meet in person John Young, the first Shuttle astronaut, who also walked on the moon (and visited the moon twice!). And I enjoyed working with the other amazing individuals, super sharp, super smart. It was super cool listening to the astronauts as they used the code I had written, especially when it was used for the first time in orbit and all worked well; that was great. And I always had a blast during mission support. Behind the big room where the flight controllers are (the one you see on TV) there is another room, called the back-room or Mission Evaluation Room, where all the engineers for each subsystem of the vehicle are located. Typically a flight controller consults with the back-room engineers when making decisions; in my case I was one of the engineers on anything related to the flight computer operating system.

I loved my time at the manned space program. I am very proud to have received the Silver Snoopy. While it didn’t pay a whole lot, it was the best job I have ever had; the people I worked with, the missions, the space program, the pride.

(A cool ‘family fact’ is that my brother also worked in the Shuttle program at the same time I did; he worked on the Thermal Protection System (the tiles). During a number of missions we saw each other in the back-room/Mission Evaluation Room while giving mission support. I am not sure how many brothers have worked together on the same mission giving mission support; but that is pretty cool. My brother is also a recipient of the Silver Snoopy Award.)

And with this last mission of the space shuttle, an era of the manned space program ends, and a new one begins, I hope. I am thankful for such experiences and proud of the USA space program, and specifically the manned space program, and what it has accomplished in its 50 years (and 30 years of the Space Shuttle program).

It is of great importance that we as a Nation get back into the manned space program as soon as possible. Otherwise we are going to lose lots of experience and expertise; gone. The manned space program requires practice, and it is not like riding a bicycle, which you can pick back up easily with little practice. In the manned space program if you do not practice, if you forget, people will die. Unacceptable. For the next five years we are going to be relying on our friends the Russians to get to space, but we really need to get back to it by ourselves. The research that comes out of this, the jobs, and the independence (and status) of our nation when it comes to space exploration depend on it.

Today I dedicate my time to working on mobile and wireless technologies and software, but I always look back at my days at the Space Shuttle program, and remember…

Godspeed to the crew of STS-135 Atlantis…

/C. Enrique Ortiz (CEO)
http://about.me/cenriqueortiz

The photo below is of Hoot Gibson handing me the Silver Snoopy Award and congratulating me:
Enrique Silver Snoopy


References:
Space Shuttle Computers and Avionics


7/16/2011 STS-135 Fail-to-Sync
Source: Huffington Post

NASA declared all five of Atlantis’ primary computers to be working, pending evaluation of the latest shutdown.

Computer failures like this are extremely rare in orbit, said lead flight director Kwatsi Alibaruho. The two problems appear to be quite different, he noted. The first was caused by a bad switch throw; the second possibly by cosmic radiation.

09 Feb

Last Space Shuttle night launch and the end of the USA manned space program in 2010


Space Shuttle Endeavour lifts off from launch pad 39A

Yesterday, Feb 8, 2010, was the last scheduled Space Shuttle night launch: STS-130 on board Endeavour. Night launches are spectacular. I should have made plans to go see that. My brother did.

Next there are only 5 3 missions left for the space shuttle fleet.

And not only has the space shuttle program been canceled; our moon program has been canceled as well, without any real plan behind it except ideas. In short, it seems the USA administration has canceled/killed the manned program completely while military spending has been increased; what is wrong with that picture? Plenty.

And to keep others from saying the manned space program hasn’t been really killed, the president has thrown 2 bones at NASA: 1) increased NASA budget (but no real plan behind it) and 2) extended the operational life of the space station, except there is a problem with that.

Yes, the space station operations have been extended to 2020, but our astronauts won’t have any capability to reach the space station, except by hitchhiking with the Russians or others. Not that I have anything against the Russians, with their great minds and who have been leaders in the space program, but having no answer on how the USA will get to space except depending on other countries is just sad.

And ironically, while the USA has canceled the moon program, the Iranians are claiming they are planning to go after it…

The Russians and Chinese, and now the Iranians, it seems will rule the space program; at least they have the vision (and it all starts with vision). The USA manned space program seems to be more and more in the hands of private sector-led endeavors; maybe, I hope.

So there go the aerospace engineering minds of the USA, with “no place” to go, and I wonder if they will end up in other countries building their national space programs? I hope some of those engineers have an entrepreneurial spirit and go start their own aerospace firms.

I’m proud of my time working in the Space Shuttle program, the awesome people that I worked with, the software that I wrote that flew and still flies, the twenty-something missions that I supported at the Mission Evaluation Room, and my Silver Snoopy. Each time the bird flew was/is so exciting, and together with thousands of other people I helped keep the manned space program going, flying and leading the way… There is NOTHING like working in the space program.

I am very concerned for the next generation of aerospace engineers, many going to school right now, my nephew being one and the son of a good friend of mine another, who I bet are confused, asking themselves “…what the hell; should I really continue to follow my dreams? What should I do?”. That concerns me a whole lot…

Let’s see what will happen next…

ceo

Image source: NASA via the EPOCH Times

20 Nov

Obama’s NASA Dilemma

Great article on Obama’s NASA Dilemma (The Technology Review).

“When president-elect Barack Obama takes office in January, he will be faced with a rare situation. Within his first 100 days, he will have to decide the fate of America’s space program.”

:

“As president, Obama will support the development of this vital new platform to ensure that the United States’ reliance on foreign space capabilities is limited to the minimum possible time period,” the document stated. “The [Orion] CEV will be the backbone of future missions, and is being designed with technology that is already proven and available.”

:

“In addition, investing in space exploration could help the next president deliver on promises of creating jobs in high-tech industries during the current economic crisis. “One way to look at the space program in these economic times is that it is a jobs program,” AIAA’s Bell says. “It would be bad to encourage people to go into science and technology and then get rid of one of the agencies that is the primary employer for those types of people.”

My take is that the Space Program is important, as it creates jobs, and expertise and knowledge in science, math and engineering, in operations and other areas, which are all extremely important skills for our future, and which are skills that are applicable beyond the space program itself… A good example close to the readers of this blog is the “deep space Internet” (Disruption-Tolerant Networking) test, a “new” network protocol that was recently tested:

“(DTN is a) …software protocol, which must be able to withstand delays, disruptions and disconnections in space, was designed in partnership with Vint Cerf, a vice president at Internet search giant Google.”

I’m sure some of that innovation on (wireless) network robustness and reliability will be applicable to our own wireless networks here on Earth…

ceo