Monday, March 21, 2011

The Light Computer

The idea of computer systems based on pulses of light moving along fiber optic cables, rather than electrical pulses through conventional wiring, has been around for a number of years. I would like to add my input to it, and also to describe my vision of a computer based on light that moves beyond the usual binary encoding altogether. (Note: I will alternate the two global spellings of "fiber" and "fibre", and also "color" and "colour", to avoid continuous use of parentheses.)

Light has actually been gaining ground on traditional magnetic and electrical computation and communications for quite some time. The most obvious examples are fiber optic cables replacing copper wire in long distance telephone service, and optical storage, first CDs and then DVDs, being used to store data instead of magnetic media. In the newest generation of optical discs, blue lasers are used because their shorter wavelength makes it possible to store much more data in the same space than a red laser could.

The great advantage of fibre optic cable over electrical wires for communication is the lack of electrical interference. Metal telephone wires also act as antennae, picking up all kinds of electromagnetic waves, which results in random noise and static that degrades the quality of the signal. Fiber optic cable suffers no such interference. However, in the U.S. the "local loop" is still copper wire; fibre optic is used mainly in long distance service.

A great amount of effort goes into doing all that is possible to protect the flow of data from interference. Telephone wires are twisted together because twisting better protects against interference. Computer network cable such as Unshielded Twisted Pair (UTP) is twisted for the same reason. Coaxial cable uses its outer shell as a shield against interference.

Communications cables often have grounded wires included that carry no data but help to absorb electrical interference. Parallel data cables, such as the printer cable, are limited in how long they can be because the signals on each wire will create electrical interference which may corrupt the signals on the other wires. Modems were designed to test the lines and adjust the baud rate accordingly.

Inside the computer, every electrical wire and circuit trace also acts as an antenna, picking up radiation given off by nearby wires. This degrades the quality of the signal and may make it unreliable. If we make the current carrying the signal stronger to better resist interference, then it will only produce more interference itself and corrupt the signals on other wires.

Designing a computer bus nowadays is a very delicate balancing act: the signal in a given wire must be strong enough to resist interference, but not so strong that it interferes with the signals on other wires. The complexity of the computer bus only makes this dilemma worse.

As we know, computing is based on electrical pulses or magnetic bits which are either on or off, representing a 1 or a 0. This is called binary because it is a base-two number system. Each unit of magnetic storage, because it can hold either a 1 or a 0, represents one bit of information.

Eight such bits are defined as a "byte". Two possibilities per bit, multiplied across eight bits, gives 256 different combinations. These are used to encode the letters of the alphabet, numbers, punctuation, and unprinted control characters such as carriage return, with each represented by one of the 256 possible numbers. This system is known as ASCII; you can read more about it on http://www.wikipedia.org/ if you like. The great thing about this binary system is that it is easily compatible with both Boolean logic and the operation of transistors. This is what makes computers possible.
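To make the arithmetic above concrete, here is a minimal Python illustration; nothing in it is specific to any particular machine.

    # Each bit has two states, so eight bits give 2**8 combinations.
    combinations = 2 ** 8
    print(combinations)             # 256

    # ASCII assigns one of those numbers to each character.
    print(ord('A'))                 # 65 -> the code for capital A
    print(chr(65))                  # 'A' -> back from the code to the character
    print(format(ord('A'), '08b'))  # '01000001' -> the eight bits as stored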

But, once again, so much of the design of computers, and so much of the available signal bandwidth, goes into making sure that the signal is reliable and has not been corrupted by electrical interference. The eighth bit in a byte is sometimes designated as a parity bit to guard against such interference. For example, if there is an even number of 1s in the other seven bits, the parity bit is set to 0; if there is an odd number of 1s, the parity bit is set to 1. The parity bit takes up bandwidth that could otherwise be used for data transfer, but it provides some rudimentary error checking against electrical interference.
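As a small illustration, here is a Python sketch of the even-parity scheme just described; the sample bit patterns are made up.

    def add_even_parity(seven_bits):
        # seven_bits is a string such as '1010011'.
        # The parity bit is 0 when the count of 1s is already even,
        # and 1 when it is odd, so the full byte always has an even count.
        ones = seven_bits.count('1')
        parity = '0' if ones % 2 == 0 else '1'
        return seven_bits + parity

    def parity_ok(byte):
        # The receiver simply checks that the total count of 1s is even.
        return byte.count('1') % 2 == 0

    byte = add_even_parity('1010011')   # four 1s -> parity bit 0
    print(byte, parity_ok(byte))        # 10100110 True

    corrupted = '11100110'              # one bit flipped by interference
    print(parity_ok(corrupted))         # False -> the error is detected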

The TCP/IP packets that carry data across the internet can be resent on request if corruption is detected along the way. A newer development in computer buses is to create a negative copy of the data, by inverting the 1s and 0s, and send it along with the positive version, on the theory that interference will affect the negative and positive copies equally.
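The inverted-copy idea can be sketched in a few lines of Python; this is only an illustration of the principle, not of any particular bus standard.

    def negative_copy(byte):
        # Invert every bit of an 8-bit value: 1s become 0s and 0s become 1s.
        return ~byte & 0xFF

    value = 0b10110010
    print(format(value, '08b'))                 # 10110010
    print(format(negative_copy(value), '08b'))  # 01001101

    # Sending both copies lets the receiver check that they are still
    # exact opposites of each other after travelling down the bus.
    received, received_neg = value, negative_copy(value)
    print(received ^ received_neg == 0xFF)      # True -> the pair is consistent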

The tremendous advantage of fiber optic is that we do not have to worry about any of this. With fibre optic cables carrying data as pulses of light, instead of electrical current, we can have hundreds of cables in close proximity to one another and there will not be the least interference between them. This is what makes the concept of computers based on light so promising.

If we could implement a data system using eleven finely tuned lasers, each a different color, the computer could work with ordinary decimal numbers instead of binary. This would not only make computing far simpler, but each pulse would also carry considerably more information, roughly three times as much as a binary pulse. We would use pulses of laser light representing 0 through 9 instead of electrical pulses representing 0s and 1s.

The eleventh colour would be a "filler" pulse, to be used only when there were two or more consecutive pulses of the same color. This filler pulse would help to avoid confusion about how many pulses there are, in the event of attenuation or other distortion of the data. In addition, multiple filler pulses in a row could be used to indicate the end of one document and the beginning of another.

This new system need not change the existing ASCII coding; we could simply express a letter, number, or control code by its number out of the 256 possibilities of a byte, rather than by the binary code of the eight bits in a byte. But it would also make possible a new "extended ASCII" of 1,000 possibilities (000 through 999), instead of the current 256, while requiring only three pulses per character instead of the usual eight bits. The extra symbols could possibly be used to represent the most common words such as "the", "this", "that", "those", "we", etc.
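Here is a rough Python sketch of the scheme described above; the filler symbol and the choice of sending each character as its three-digit ASCII code are my illustrative assumptions about how it might work.

    FILLER = 'F'  # the eleventh colour, used only between repeated pulses

    def encode_text(text):
        # Each character is sent as its three-digit ASCII code (000-255),
        # so 'A' (code 65) becomes the pulses 0, 6, 5.
        digits = ''.join(format(ord(ch), '03d') for ch in text)
        pulses = []
        for d in digits:
            if pulses and pulses[-1] == d:
                pulses.append(FILLER)   # break up runs of the same colour
            pulses.append(d)
        return pulses

    def decode_pulses(pulses):
        digits = ''.join(p for p in pulses if p != FILLER)
        return ''.join(chr(int(digits[i:i+3])) for i in range(0, len(digits), 3))

    pulses = encode_text('ABBA')
    print(pulses)                 # ['0', '6', '5', '0', '6', 'F', '6', ...]
    print(decode_pulses(pulses))  # ABBA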

There would be no 1s and 0s, as in binary. There would only be a stream of pulses of different colours, with no modulation or encoding of information such as is used when laser light carries the sound of a voice in non-digital fibre optic telephone communication. All that would be necessary is to keep one color distinguishable from another and to keep the pulses in the proper sequence. If we could do this, any attenuation in the length of the pulses would make no difference. As our technical capabilities increase, we could increase the data transfer rate by making the pulses shorter.

When you dial a telephone number, the tones that you hear use different frequencies to represent each number on the dialpad. A computer using light would handle its data with exactly the same concept.

It probably is not a good idea to try to use more than eleven colours at this point, because that would make it increasingly difficult to distinguish one pulse from another. This old binary and ASCII system is really antiquated and I think fiber optics gives us the opportunity to move beyond it. This is yet another example of how we make much technical progress while still using a system designed for past technology so that we end up technologically forward but system backward.

Data Storage Using Light

Computing is a very old idea, but its progress has always been dependent on the technology available. Prehistoric people counted using piles of pebbles. Later, a skilled user of an abacus could quickly do arithmetical calculations. In the industrial era, Charles Babbage designed the mechanical programmable computers that are considered the beginning of computing as we know it. I have seen some of his work, and modern reconstructions of it, at the Science Museum in London.
The development of vacuum tubes opened the possibility of computing electronically. But since such tubes use a lot of power, generate a lot of heat, and have to be replaced on a regular basis, it was only when transistors and other semiconductors were developed that modern computers really became a possibility.

Thus, we can see that there has always been steady progress in the development of computing, but this progress has been dependent on the materials and technology available at the time. This brings up the question of what the next major step might be.

I think that there are some real possibilities for the future in the pairing of lasers and plastics. The structure of plastic is one of long polymers, based on carbon, which latch together to create a strong and flexible material that is highly resistant to corrosion. Fuels are made of the same type of carbon-based chains, the main difference being that the chains in plastics are far longer, so that they latch together to form a solid rather than a liquid.

As we know, light consists of electromagnetic waves in space. Each color (colour) of light has its specific wavelength. Red light has a long wavelength, and thus a low frequency, while blue light has a shorter wavelength and a higher frequency.

The difference between light from a laser and ordinary light is that the beam from a laser is of a single, sharply defined wavelength and frequency (monochromatic), so that the peaks and troughs (high and low points) of the wave are "in step". This is not the case for non-laser light, which is invariably composed of a span of frequencies that cannot be "in step" in the same way because their wavelengths vary.

This is why a laser can exert a noticeable force on an object: the peaks and troughs of the light strike the object at the same instant. With ordinary light this effect is diluted, because the peaks and troughs are "out of step" due to the varying wavelengths of the light. Laser light can also cross vast distances of space without broadening and dissipating the way the ordinary light from a flashlight does.

Now, back to plastics. Suppose that we could create a plastic of long, fine polymers aligned more in one direction than in the perpendicular directions. You might be thinking that this would defeat the whole idea of plastics, since such a plastic could be more easily torn along the line of polymer alignment.

But what if the light from a laser could permanently imprint the wavelength of the light on the fine polymers of this plastic? If an ordinary beam of white light, which is a mix of all colours (colors), were then shone on the spot of plastic, that spot would have taken on the color (colour) of the laser and thus would reflect this colour back.

We could refer to the plastic as a "photoplastic", because its polymers would take on the color of whatever laser light was last applied to it. It would, of course, be required that the polymers of the plastic be considerably longer than the wavelengths of the laser light.

This photoplastic would not be useful for any type of photography, because the wide range of wavelengths of light falling on it would dissipate one another's influence on the polymers of the plastic. But it could be extremely useful for storing data.

With magnetic storage of data there are only two possibilities for each magnetic bit, either "off" or "on", a 0 or a 1. Eight such bits are known as a "byte", and since 2 multiplied by itself eight times gives us 256 possible combinations, the ASCII coding that is the foundation of data storage is based on this number.

But if we could use this photoplastic with lasers of eleven different colours, each storage location would have eleven different possibilities rather than only two. Just as we can convey much more information with color images than with simple black and white, we could store far more data in the same space using this method instead of magnetic storage.
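The gain in density can be put into numbers. Each eleven-state mark carries log2(11) bits of information, against one bit for a two-state magnetic mark; with one colour reserved as a filler, the figure is log2(10). A couple of lines of Python show the values:

    import math

    bits_per_binary_mark = 1.0
    bits_per_mark_eleven = math.log2(11)   # all eleven colours carry data
    bits_per_mark_ten = math.log2(10)      # ten data colours, one filler

    print(round(bits_per_mark_eleven, 2))  # 3.46
    print(round(bits_per_mark_ten, 2))     # 3.32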

The processor of a computer processes data by using the so-called "opcodes" that are wired into it. A processor might have several hundred, or more, opcodes wired in. These opcodes are designated using the base-sixteen hexadecimal number system, which uses the digits 0-9 and the letters A-F to make a total of sixteen characters.

These "hex" numbers as designators of the opcodes built into the processor are known as "machine code". Assembly Language is a step above machine code and uses simple instructions to combine these opcodes into still more operations. All of the other higher-level computer languages do the same thing, combine together the opcodes wired into the processor to accomplish the desired operations.

Until we can develop a "light processor" to work with the light storage and light transmission of data that I have described here, the actual processing will still have to be done with electrons. But it is clear to see that the use of light in computing would be the next step forward from what we have today.

Celestial Locator Grid

One thing that I have long been really interested in, and have not yet discussed on this blog, is the possibility of extending the system of latitude and longitude that we use to describe locations on the earth's surface into outer space.

We have great difficulty in describing precise points in space. We make use of the background of constellations and distance from the sun, but it leaves a lot to be desired.

The standard astronomical system of declination, the angle in degrees north or south of the celestial equator, together with right ascension, is used to pinpoint astronomical objects. Expressing a remote location in terms of angular degrees has the advantage that it matches the way human beings look at things. The disadvantage is that it becomes less and less precise the further away the remote location is.

Those who have read my book "The Patterns Of New Ideas" will recall that I presented a solution entitled "The Celestial Meridian", a straight line between the centers of the sun and the star Regulus, as the foundation of such a grid locator system in space. I chose Regulus because it is not only a bright star but is on the ecliptic, the line of the apparent movement of the sun across the background stars during the course of the year. Plotted against the celestial equator, the ecliptic traces a sine-wave shape due to the tilt of the earth's axis, the same tilt that produces the seasons.

Today, I would like to present another possible plan for a grid locator system in outer space which follows the same concept as the latitude/longitude system used on the earth's surface.

The trouble with trying to institute such a system is the lack of fixed reference points in space. In our Solar System, everything except the sun is in continuous relative motion. The planets are not even at fixed distances from the sun, and they do not orbit the sun in exactly the same plane.

For example, there is a difference of about five degrees between the plane of the moon's orbit and the plane of the earth's orbit around the sun. If they were in the same plane, there would be both a lunar and a solar eclipse every month. Comets tend to orbit the sun far above and below the general plane of the planetary orbits.

Why not extend the earth's compass directions into space? North and south are already well-defined. The points in space directly above the north and south poles always remain the same. The earth's poles actually do shift, but only over the course of thousands of years.

The difficulty here is that the tilt of the earth's axis, and the continuous variation in the overhead position of the sun on the earth's surface from the Tropic of Cancer to the Tropic of Capricorn, which produces the apparent sine wave of the ecliptic against the background stars, makes it impossible to define an obvious celestial east and west.

Compass directions on earth are fixed, making the latitude and longitude system possible. But these directions are tilted at 23 1/2 degrees relative to the plane of the earth's orbit around the sun. The tropics extend for 23 1/2 degrees either side of the equator; they are the zone where the sun is directly overhead at some point during the year.

The axial tilt of 23 1/2 degrees is also reflected in the Arctic and Antarctic Circles, which define the zones on the earth's surface where the sun does not always rise and set once every 24 hours. Summer in either the northern or southern hemisphere is the period when that hemisphere is tilted toward the sun due to the axial tilt and the opposite hemisphere is tilted away from it.

The latitude at which the sun is directly overhead varies through the year. It is 23 1/2 degrees north of the equator on the first day of the northern hemisphere summer and 23 1/2 degrees south of the equator on the first day of the northern hemisphere winter. The first days of summer and winter are known as the solstices.

It is only on the first days of spring and autumn that the sun is directly overhead at the equator, so that night and day are of equal length. For this reason, these two days are known as the equinoxes, meaning equal length of day and night. The equinoxes are also the only two days when the earth's polar axis is perpendicular to the line between the earth and the sun.

Celestial north and south in our new space locator grid are very clearly defined by the points that are always overhead at the poles. Since the polar axis is perpendicular to the line between the earth and the sun at the two equinoxes, let's define a line from the center of the sun through the center of the earth, and continuing on into space, at the vernal equinox (the first day of northern hemisphere spring) as celestial east (for Easter), and the opposite line at the autumnal equinox as celestial west.

The earth's surface is two-dimensional, while space is three-dimensional. This means that we require another line to express a point in space. This can only be the line from the earth's center at one solstice, through the center of the sun, to the earth's center at the opposite solstice. Let's call this the solstice line; it is perpendicular to both the polar axis and the equinox line.

To avoid confusion and errors, let's refer to the two opposite directions on the solstice line simply as celestial june and celestial december, since this is when the solstices fall. These three coordinates can precisely define any point in the neighborhood of the solar system. Maybe the Celestial Meridian can be used for deeper space.
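A small Python sketch shows how the proposed grid could be used in practice; the axis names follow the text, and the sample coordinates are invented.

    # The three perpendicular axes of the proposed grid, as unit vectors.
    CELESTIAL_NORTH = (0.0, 0.0, 1.0)  # toward the point overhead at the north pole
    CELESTIAL_EAST  = (1.0, 0.0, 0.0)  # along the sun-earth line at the vernal equinox
    CELESTIAL_JUNE  = (0.0, 1.0, 0.0)  # along the solstice line, perpendicular to both

    def position(east, june, north):
        # A point in space expressed as (east, june, north) distances
        # from the chosen origin -- the sun, the earth, or even a spaceship.
        return tuple(east * e + june * j + north * n
                     for e, j, n in zip(CELESTIAL_EAST, CELESTIAL_JUNE, CELESTIAL_NORTH))

    print(position(2.0, -1.5, 0.5))   # (2.0, -1.5, 0.5)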

The orbit of any body in the solar system could be easily described mathematically by use of such a grid system. We could easily plot the relative positions of planets and other objects for any given time. The grid locator system can be centered on the earth, the sun, or any other point for that matter, such as a spaceship, with a constantly varying conversion factor between any two centers. This system could also benefit from the 180 degree trigonometric function that I described in the posting "New Trigonometric Functions" on this blog.

Human Language Compilation

I believe that computers should make translation of documents from one language to another simple, quick, and easy.

Computers do not actually work with our letters and words. All that they understand is the alignment of magnetic bits to represent either a 1 or a 0. But if we group such bits into groups of eight, known as a byte, we have 256 possible combinations, because each of the eight bits can be magnetically set to represent either a 1 or a 0, two possible states, and two multiplied by itself eight times is 256.

In the code system known by its acronym, ASCII, each letter of the alphabet, lower case and caps, as well as numbers, punctuation, and control characters, is represented by one of the 256 possible combinations that make up a byte.

In the programming of computers, special languages are developed as a link between the tasks that humans want the computer to do and the opcodes (operational codes) that are wired into the processor of the computer. The processor in a typical computer might have several hundred opcodes, representing the tasks that are wired into it.

Opcodes are used in combination to create a vast number of possible language commands. They are designated by hexadecimal code. This is a numbering system based on sixteen, rather than ten. It uses the digits 0 through 9, followed by the letters a through f. This base is used because a unit of four bits (a nibble) can take on sixteen possible combinations. This numbering system is used for such things as memory addresses in the computer, as well as opcodes.

Computer programming languages fall into two broad categories, those that are interpreted and those that are compiled. Simple web scripts, such as those written in JavaScript, are interpreted by the browser line by line and are not compiled. BASIC (Beginners All-purpose Symbolic Instruction Code) was originally designed as an interpreted language so that programming students could write a program and watch it being run line by line.

In high-level languages that are compiled, such as C++, a special program must be written to link the language with each processor that enters the market. This is because each and every processor has its own set of opcodes. This special program is called a compiler, and it goes over the program and, in several steps, breaks it down into the opcodes of the particular processor on the computer on which it will be run.

In assembly language, which is a low-level computer language only a step above the computer's machine code, or opcodes, a similar program, called an assembler, is used to translate the commands into the machine code that the processor can work with. Such a low-level language is arduous to write, but it is used when a very short program that can run very quickly is required.

There are also computer languages which are neither purely compiled nor purely interpreted. The popular Java compiles source code into "Java bytecode", a portable intermediate code (a "p-code"), which a "virtual machine" on each computer then executes. This is what enables Java to operate across all computer platforms.

What I want to ask is why can't we write a compiler for human languages? If compilers are special programs that break the commands of high level computer languages into the opcodes that are wired into the processor of the computer, then the next step should be a compiler for any written language. The compiler could scan each sentence and break it down into numeric code.

A compiler could be written to link each human language to this code so that a document in one human language could be easily translated into another, as long as a compiler had been written for both languages.

The roadblock to an accomplishment such as this is, as I pointed out in the posting "Numbered Sentences" on my progress blog, that the word is not the primary unit of human communication when we are concerned with translating one language into another. Working with letters and words is fine as long as we remain within one language. But it is the sentence, not the word, which must be translated from one language to another. A word-for-word translation usually produces little more than gibberish, simply because grammar and syntax differ from one language to another.

Such a coding system does not yet exist. It would be similar in concept to ASCII, but would require us to break down every possible sentence and, after eliminating redundancies, assign each a numeric code that would be the same regardless of which human language it was in. It would be fairly simple to assign nouns and verbs a place in a language tree structure. Spell checking and grammar checking software is already widely used, and this is the next logical step.
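A minimal Python sketch of the idea might look like the following; the sentence codes, the handful of sentences, and the translations are invented purely for illustration.

    # Invented sentence database: sentence -> universal numeric code.
    SENTENCE_CODES = {
        'how are you?': 1001,
        'where is the station?': 1002,
    }

    # Pre-translated forms of each coded sentence in each language.
    TRANSLATIONS = {
        1001: {'en': 'How are you?', 'fr': 'Comment allez-vous?'},
        1002: {'en': 'Where is the station?', 'fr': 'Où est la gare?'},
    }

    def translate(sentence, target_language):
        # "Compile" the sentence down to its numeric code, then look up
        # the pre-translated sentence in the target language.
        code = SENTENCE_CODES[sentence.strip().lower()]
        return TRANSLATIONS[code][target_language]

    print(translate('How are you?', 'fr'))   # Comment allez-vous?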

My vision for the next breakthrough in the progress of computers lies not in technology, but in how we approach language. I have written about this topic already on the progress blog. The great limitation is that we are still using the ASCII system of coding that has been in use since 1968, when available computer memory was maybe one-thousandth of what it is now.

Basically, computer storage revolves around magnetic bits. Each bit can be either a 1 or a 0, on or off, so that there are only two possible states for each bit. This means that eight such bits have 256 possible combinations, which is ideal to encode all of the alphabet, lower case and capitals, numbers, punctuation, as well as unprinted controls such as carriage return and space. This is the system that we use, reading computer memory in the groupings of eight bits that is referred to as a "byte".

Computers only deal with numbers, while we communicate mostly with words. This means that we have to create artificial languages to communicate with computers, and to instruct them what to do. There are several hundred opcodes, or basic instructions, wired into each computer processor. Machine code tells the computer what we want it to do by combining the instructions in these opcodes.

This machine code, which is expressed in the so-called hexadecimal number system consisting of the numbers 0-9 and the letters A-F, is actually the most fundamental level of computer language. One step up from this is assembly language, which is expressed in simple letter instructions and works by combining machine code instructions to the processor.

We can build higher level computer languages from this, all of which work by combining the instructions of lower-level languages. Some languages, such as those for web scripting, are interpreted in that they are simply read by the browser line-by-line. Most must have a compiler written to link each computer language to each new processor that comes on the market. The great advantage of higher-level languages is that the programmer does not have to understand exactly how the processor works in order to write instructions.

I find this system to be inefficient in the extreme for modern computing, as described on the progress blog. This is another example of how we have a way of becoming technically forward, but system backward.

For one thing, with the spell-check technology available nowadays, there is no need to encode capital letters. We could shorten the encoding, and so speed up computers, by encoding all letters in lower case and letting spell checkers capitalize the appropriate letters at the receiving end.

For another thing, I had the idea that we could consider all of a document's letters, numbers, punctuation, and controls as one big number. This would mean treating it as a base-256 number, instead of using the base-ten system that we are used to. This relatively simple change would greatly multiply both the storage space and the speed available by compressing any document, as described in "The Floating Base System" on this blog.
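Python can already treat a string of bytes as a single base-256 number, which gives a feel for the idea; this is only an illustration of the principle, not of the Floating Base System itself.

    # Treat a short piece of text as one large base-256 number, and back again.
    text = 'light'
    as_bytes = text.encode('ascii')

    # Each byte is a "digit" between 0 and 255 in a base-256 number.
    big_number = int.from_bytes(as_bytes, byteorder='big')
    print(big_number)   # one large integer standing for the whole string

    # The conversion is reversible, so nothing is lost.
    recovered = big_number.to_bytes(len(as_bytes), byteorder='big').decode('ascii')
    print(recovered)    # light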

Today I would like to write more about what should definitely be the next frontier in computing, reforming the basic system of encoding.

There are three possible ways to encode written information in a computer: by letters, by words, or by sentences. The way it is done now is still by letters, which is by far the most primitive and inefficient of the three and is a reflection of the strict memory limitations of 1968. The real unit of communication is actually the sentence, as we have seen on the progress blog, in "Human Language Compilation" and "Numbered Sentences". Notice that languages must be translated from one to another by the sentence, not by the word. This is because grammar and syntax differ from language to language, and word-for-word translations usually produce little more than gibberish.

To encode by sentences, we could scan the dictionary for sensible combinations of words that make sentences and then eliminate redundancies, or sentences that mean the same thing. This would give us a few million sentences that are used in communication. There would also be special pointers to names, place names, and culturally specific words. This would not only make storage and transmission of information many times more efficient, but would also facilitate easy translation from one language to another because all sentences could already be pre-translated.

The user would type a sentence, and then pick the one that came up from the database that was most like the one that was typed. Each one of these would have a pre-assigned bit code, similar in concept to the present ASCII.
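The "pick the closest stored sentence" step can be sketched with Python's standard difflib module; the tiny sentence database and its codes are invented for the example.

    import difflib

    # Invented sentence database with pre-assigned codes.
    SENTENCE_DATABASE = {
        'the meeting starts at noon.': 2001,
        'the meeting has been cancelled.': 2002,
        'please send me the report.': 2003,
    }

    def suggest(typed_sentence):
        # Offer the stored sentence closest to what the user typed,
        # together with its pre-assigned code.
        matches = difflib.get_close_matches(typed_sentence.lower(),
                                            SENTENCE_DATABASE.keys(), n=1)
        if not matches:
            return None
        best = matches[0]
        return best, SENTENCE_DATABASE[best]

    print(suggest('The meeting starts at noon'))
    # ('the meeting starts at noon.', 2001)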

There is yet another approach to better integrating computers with the ordinary language that we communicate with, one that I have not written about yet. This approach involves words, rather than sentences, and will be more complex and difficult than numbering sentences, but it would be the ultimate in language-computer integration and is what I want to add today.

Words are actually codes, which is why we have dictionaries for words but not for numbers. A word serves to differentiate something that exists from everything else; this fits with that all-pervasive pattern that I termed "The One And The Many", as described in the posting by that name on the patterns blog.

Since we are more complex than our inanimate surroundings, there is not enough complexity for everything that we could conceive of to actually exist. So, words also define for us that which does exist from that which doesn't. This is why we require words as well as numbers: only a fraction of what could exist, from our complexity perspective, actually does exist.

Words, as codes, are far more complex than numbers. Although it may not seem like it, there is a vast amount of complexity packed into each and every word. All of the complexity of its pre-agreed-upon meaning is contained in a word. Words can be thought of as a kind of "higher level" of numbers, in a way similar to the levels of computer languages.

Numbers differ from words in that everything is basically numbers being manifested. They exist in the universe of inanimate space and matter, while words don't. Numbers are less complex than words, but are not required to differentiate that which exists from that which doesn't as words are.

We must completely understand something in order to describe it with numbers, although that is not the case with less-precise words. (In "The Progression Of Knowledge", on the progress blog, I explained how this can give us an idea of where we stand as far as how the volume of knowledge that we have now compares with all that we can possibly know).

We cannot determine the complexity of the words that we must fall back on, because if we could, we could continue our description of reality with numbers and would not need the words. We know what words mean, or else they would not be useful, but we do not know how much actual complexity a word contains in its meaning, because if we did, we could express its meaning with numbers and would no longer really need the word.

In "Outer Mathematics" on either this or the patterns and complexity blog, we saw how numbers are all that there really is. Everything is actually numbers being manifested. This means that there must be a formula for everything that exists. But because of our complexity level, we are unable to discern formulae about ourselves or things more complex than us.

We can only arrive at a formula for something that is less complex than our brains, which have to figure it out and completely understand it. To derive a formula about ourselves, or about things more complex than us, we would have to be "smarter than ourselves", which is impossible. We could take the communication systems of animals and break them down into numbers, but we cannot do that with our own. So, we can only rely on words for such descriptions.

But if there must be a formula for everything, even if it is hidden from us by our complexity perspective, that must also include words. Out there somewhere, there must be a way to substitute a number or a formula for every word in our language. If only we could arrive at this, it would be possible to construct a very complex system of numbers and formulae that would parallel the words that we use to communicate.

If we could only accomplish this, we would have the numbers that computers can deal with. Computers could deal directly with ordinary words, at least the ones that we had incorporated into this matching structure of numbers and formulae, and these artificial computer languages would no longer be necessary. We cannot see this, at our complexity level, because we are up against our own complexity and we cannot be "smarter than ourselves".

In the universe of inanimate matter, there is only quantity. In other words, everything is really numbers but with inanimate matter these numbers and formulae that describe everything are only one-dimensional. When we deal with living things, particularly ourselves, we have to deal with quality as well as quantity.

We can differentiate between the two by describing quantity as one-dimensional and quality as multi-dimensional. Quality forms a peak, which is the intersection of at least two slopes, while quantity forms a simple slope. Quality is not simply "the more, the better", but is a peak factor. This is why we are so much more complex than the surrounding inanimate reality.

I explained a simple version of structuring words like numbers in "The Root Word System" on this blog.

So, just as we did the Human Genome Project and the Sloan Digital Sky Survey, let's put powerful supercomputers to work developing the structure that must exist to incorporate every word that we use, so that each word can be expressed as a number or formula in the overall structure. Computers would then be capable of dealing with ordinary human language. All that we would have to do is tell the computer what we wanted it to do, and computers would be unimaginably more useful and easier to use than they are now.

The Root Word System

I was looking up a word in the dictionary when a thought came into my mind. A dictionary is actually a symptom of how inefficient our language is.

A perfect language, if it could exist, would actually resemble the number system that we use. Our number system is far more efficient than our word system. Numbers are self-explanatory, while words are not.

You do not need a dictionary to tell you the meaning of a number, because there is really no way to define a number better than the number itself. But suppose we used a different system and the symbol for the number 357 was completely different from the number 358. You would then need to carry a number dictionary around.

Of course, such a system would be highly illogical. But this is the way words operate. Two closely similar words can mean completely different things, in the same way that two similar things may be described by totally different words. In contrast, we know that two similar numbers, such as 357 and 358, must mean close to the same thing.

Unlike numbers, words were thrown together haphazardly over the centuries. Words are organized as verbs, nouns, pronouns, adverbs, and adjectives. But reality is fundamentally numbers and is ordered like our numbering system. We could have taken great advantage of this, except that the words were already in existence.

Words are different from numbers in that only a tiny fraction of the things that could potentially exist actually do exist and need to be assigned a word. In contrast, every number is known to exist whether it is manifested or not. It would make no sense to imagine things that do not exist and assign words to them.

Humans used symbols to represent words long before the idea of alphabets came along. Letters of the alphabet were used to represent spoken sounds, and written words were grafted onto spoken words.

Then I got to wondering: what if letters had come first, so that they could have been used from the beginning to represent words, using an alphabet? We could have structured words like numbers, and they would be just about as self-explanatory as numbers. Words could have successive roots, and building words would be like building molecules from atoms.

For example, the word "atom" could be such a root word so that anything associated with atoms, such as the component electrons, neutrons and, protons, would be assigned words with the root "atom-". Another such root word could be "house-", and anything associated with a house would be assigned words built on this root word. Root words could be combined together, if necessary, with the most fundamental one coming first.

We could even make the system still more efficient by assigning root words according to the letter of the alphabet they begin with, so that related or similar things would be assigned similar words. We would start by dividing all words into the parts of speech: verbs, nouns, etc.

The incredible efficiency that would have been possible with such a word system could be extended to sentences. The fundamental unit of communication is really the sentence, rather than the word. But we are forced to revolve communication around the word, due to this great inefficiency. It would often be possible to crunch sentences together into a single word. This would be extremely useful digitally.

Such a root word system, making words operate in the same way as the number system, would bring us almost unfathomable advantages. For one thing, it would make learning immeasurably easier. Students would need to learn only a relatively few root words. Also, the more fundamental the word, and thus the more frequently it is likely to be used, the shorter it would be.

Not only would the Root Word System be incredibly efficient, it would make it a simple matter to translate between languages. It would be so easy for any document to be analyzed by computer because it would essentially be a program in itself and the compiler would only have to parse the root words.

Even if you do not think that the Root Word System is a practical idea at this point, because it would involve creating a whole new human language, it could start off as a computer language.

Human Formulae

Euclidean geometry is based on an axiom, the parallel postulate, that cannot be mathematically proven: on a plane with a line and a point outside the line, there is one and only one line which can be drawn through the point that will be parallel to the line.

However, this axiom is considered to be so self-evident that it really does not need to be proven. Just the fact that this system of geometry has served satisfactorily for more than 2,000 years can be considered proof enough of its truth.

Today, I would like to introduce an axiom. Although the axiom cannot actually be proven, I consider it to be self-evident in its truth. It is that any given system can be described by a mathematical formula, as long as the system is of less than infinite complexity. A formula exists to describe any system, either in its present state or in how it was formed, whichever is more compact.

Since there is a finite amount of space and matter in the universe, it must be of less than infinite complexity and thus there is a formula out there somewhere that describes the entire universe.

There is a clear rule that I can see regarding the derivation of formulae (the plural of formula is formulae). To discern a formula in some process, the observer must necessarily have more complexity in their mental processes than is present in the process being observed. In other words, the formula governing the process will only be apparent to the observer if the observer is "smarter" than the process being observed.

Another obvious requirement regarding the derivation of formulae is that the process be completely understood. It is not possible to attach a precise mathematical formula to a process unless that process is thoroughly understood. In the posting "The Progression Of Knowledge" on this blog, I explained how the difference between science and mathematics is that science concerns that which we partially understand while the realm of mathematics is that which is completely understood.

We do just fine at deriving formulae for scientific processes and the behavior (behaviour) of inanimate matter. But what about our understanding of human beings?

Humans are extremely complex. But while we are extremely complex, we are not infinitely complex. Somewhere out there is a formula that accurately describes anything to do with humans, not in the general terms of philosophy but in a precise mathematical equation.

But if we try to derive it, we run into the roadblock that we cannot be smarter than ourselves, and we would have to be in order to derive such a formula. So while the formula must exist, because we are of less than infinite complexity, not only can we not be smarter than ourselves, we certainly cannot be smarter than everyone else combined. So, the formula remains out of reach.

But what about supercomputers, or networks of supercomputers? Or grid computing, harnessing the processors of thousands of desktop computers? What if we had such computing power working for weeks or months on deriving formulae that might be out of the reach of ordinary human minds?

This would be particularly useful with languages. The bottleneck of computer processing is the information that we have to give the computer to work with, but that would not be a factor with language analysis. I have written previously about possible ways to translate between languages, on this blog and in my book "The Patterns Of New Ideas", but those previous concepts all involve pre-translation of sentences.

Providing a word-for-word translation between two languages is fairly simple. But that is only the beginning of actual translation, because grammar and syntax are different from one language to another. Chances are that a word-for-word translation would produce little but gibberish.

Word usage also differs. As a simple example, if we wanted to know how a machine operated in English, we would ask "How does it work?". But in French, we would have to ask the equivalent of "How does it walk?".
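A few lines of Python show why a word-for-word dictionary alone falls short, using the French example above; the tiny dictionary is invented for the illustration.

    # Word-for-word dictionary with literal senses only.
    FR_TO_EN = {
        'comment': 'how',
        'ça': 'it',
        'marche': 'walks',   # literal sense; in this idiom it means "works"
    }

    def word_for_word(sentence):
        return ' '.join(FR_TO_EN.get(word, word) for word in sentence.lower().split())

    print(word_for_word('Comment ça marche'))   # 'how it walks' -- not 'how does it work?'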

But if computer compression software routinely examines text and graphics, looking for patterns in the data so that it can be compressed for more efficient storage and faster download, then why not examine entire languages and break down all the rules of grammar and syntax into formulae? Combine this with a simple word-for-word dictionary and it should be straightforward for computers to provide translation from one language to another.

Even with all that computers are doing now, there is certainly much more that they could be doing that has not yet been thought of, and the derivation of human formulae is certainly among those possibilities.

There is a general agreement among physicists that all that exists is really numbers and mathematics. Everything is numbers being manifested in some way, with no exceptions. An obvious example is the chemical elements. An element is defined by the number of protons in its nucleus, as displayed on the periodic table.

There is a simple, yet far-reaching implication of everything really being numbers. This must mean that every system and every process in the universe could potentially be expressed as a mathematical formula. There can never be anything that does not have a formula describing it.

My thought is that the next major frontier in science is how we fit into the universe. Remember how I have described in the cosmology blog that we see and experience the universe the way we do not just because of what it is, but also because of what we are. The next step is to get a look at the universe from "outside ourselves".

The way to go about this is with the knowledge that everything that exists can be broken down into a formula. The trouble is that we are of a certain level of complexity, and we can only analyze things enough to break them down into formulae if they are less complex than we are. We are not able to understand those things that may be more complex than we are, at least not enough to derive a formula. Just as the average cat does not understand the processes of planetary formation, there must be some things about the universe that are simply beyond us.

If everything that exists can be broken down into a formula, that must mean that somewhere out there there is a formula that precisely describes and predicts our behavior (behaviour). We cannot arrive at this formula because to do this, we would have to be "smarter than ourselves", which is impossible.

We saw in "The Progression Of Knowledge", on this blog, that to describe something with mathematics it is necessary to completely understand it. Mathematics is the realm of that which we completely understand, while science is the realm of that which we partially understand. We get still more subjective, and further away from mathematics, when we describe something as "it's an art, not an exact science".

But what is mathematics, what is science, and what is an art depends on what we could call our "complexity perspective". We have a certain level of complexity, which is much higher than that of our surrounding environment of inanimate matter but is still limited. There are actually formulae for everything, but many of them, particularly those concerning ourselves and our nature, are beyond our reach due to our complexity perspective. I have written about this previously, in "Human Formulae" on this blog, but today I want to add more to it.

These unreachable formulae, which can describe everything that there is, are what I call "outer mathematics". The inner formulae are the ones that are within our reach by observation and reasoning. Once in a while, someone has a leap of insight that takes us beyond our usual limits and leads us to a new formula, such as Einstein's E = mc², describing the apparent amount of energy contained in matter.

Complexity is a related topic. On the patterns and complexity blog, I have described what a tremendous advantage it would be to us if we could begin quantifying complexity. This means putting an actual number on the complexity of something, not just expressing it in subjective terms such as "less complex than" or "much more complex than".

But, once again, complexity is a matter of our perspective, resulting from our own complexity. Complexity is not something that we can measure with a ruler or a meter; I have explained how it will require novel ways of measurement.

For example, we know that the more complex something is the more there is that can potentially go wrong with it. Thus, we could measure the complexity of the human body by going through medical journals and counting all of the things that can possibly go wrong with the body. This would not, however, give us the absolute complexity of the body but only the complexity level relative to that of the surrounding inanimate matter.

As another example, we could put a number on the relative complexity of a society in the year 1900, in comparison with today, by counting the total number of job descriptions in the society. The more complex the society, the more different jobs there would be.

What about time? Since we can now see that the passage of time is something within ourselves and our nature, as described in detail on the cosmology blog, http://www.markmeekcosmology.blogspot.com/ , there must be a formula somewhere describing how our consciousness moves along the strings of matter composing our bodies and brains at what we perceive as the speed of light. If we could just obtain this formula, we might be able to do all kinds of things with time.

It has been said that, with regard to science, the Nineteenth Century was the century of chemistry, the Twentieth Century was the century of physics and the Twenty-First Century will be the century of biology.

The processes of living things have not yet been broken down into formulae in anything like the same way as chemical processes or the forces of nature. This is simply because of the limitations of our complexity perspective: we are living things ourselves, and we cannot be "smarter than ourselves", which we would have to be in order to derive the formulae that define and describe us.

It seems clear to me that the next frontier in science is to put the ever more powerful supercomputers to work to give us a view of the universe, and of how we are a part of it, that we cannot see because of our complexity perspective, in the same way that the powerful telescopes of the Twentieth Century gave us a view of the physical universe that we could not see with our eyes alone. Just as there was a universe of galaxies that we could not see on our own, there is a realm of "outer mathematics" that we cannot access on our own but only from outside ourselves.