Sunday, April 26, 2020

The Reason Why

So before I continue with potential solutions to the many problems I perceive surrounding existing formal logic systems (the implication operator, the issues surrounding the introduction of time into a formal system, and the concerns regarding replacing a binary logic system with a many valued logic system), which may ultimately bring us into the realm of modal logic (though I suspect it's not that simple), I think it is best we take a step back and consider what it is we are trying to accomplish by formalizing any of this.

In other words, what is the "why" of a logical formal mathematical system, and what questions would we like to try to answer once we have created such a thing? What is it we are trying to accomplish exactly? I suspect we are trying to address the following general areas …

1) Did something happen (past) or will something happen (future)?
(and is it the case that we can only answer for the present tense, or past or future?)

2) What is the certainty surrounding the derived answers (the probability) and is there any such thing as absolute certainty?

3) Is all of this somehow relative, either to the evaluator, the time or the place? Is it the case that everything is relative based on an evaluator's time and place and maybe even, who or what an evaluator is?

It would seem that predicting the future is harder than predicting the past, but is this true, and how do we demonstrate it? Clearly, if we could prove that something absolutely must happen given a minimum set of required inputs, we could then predict the future with 100% certainty. Is it even possible to enumerate a minimum set of required inputs? Is it any easier to predict what did happen versus what will happen? How does probability enter into the equation? Can we ever say something is absolute?

To answer some of these questions let's go back to my perceived problems which may not be problems at all since I tend to be a simple uneducated layperson with nothing but questions and little formal education. 

Probability can sometimes be expressed in terms of 'necessity' and 'possibility'. When we look at modal systems we will see that one can be expressed in terms of the other, so I don't think this is a problem we cannot overcome with already understood methods. Perhaps many valued logic can be handled using a predicate of this nature.

When we further investigate the concept of time, we discover that, building on A.N. Prior's work, Saul Kripke seems to have proposed a solution to this issue, and so have Rescher and Urquhart, so one might be inclined to think this is just a matter of formalism and the logical conclusions derived from using their systems. Considering Rescher was a leading researcher into many valued logic, perhaps this solution is tied to the solution for time as well. I need to understand these systems better to be certain, but I suspect the foundational problem is not time, especially if we constrain time to the past, nor do I believe it to be probability, again if we constrain it to the past.
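Just to make the necessity/possibility idea concrete, here is a tiny possible-worlds toy I threw together in Python. The worlds, the accessibility relation and the propositions are all made up for illustration; this is only a sketch of the general Kripke-style setup, not anyone's actual formal system.

# A tiny possible-worlds toy (my own made-up worlds and propositions, just to
# illustrate the Kripke-style setup): necessity means "true in every world I
# can see", possibility means "true in some world I can see".

worlds = {"w1", "w2", "w3"}
access = {                       # which worlds each world can "see"
    "w1": {"w1", "w2"},
    "w2": {"w2", "w3"},
    "w3": {"w3"},
}
true_at = {
    "q": {"w2"},                 # q holds only at w2
    "not q": {"w1", "w3"},       # its negation holds everywhere else
}

def necessarily(prop, w):
    return all(v in true_at[prop] for v in access[w])

def possibly(prop, w):
    return any(v in true_at[prop] for v in access[w])

for w in sorted(worlds):
    # the interdefinability mentioned above: possibly q == not necessarily (not q)
    assert possibly("q", w) == (not necessarily("not q", w))
    print(w, "necessarily q:", necessarily("q", w), "possibly q:", possibly("q", w))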

Is there anything which is absolute in our universe, and if there is, can there be any operations for combining absolutes to continue the absolute chain of certainty? Or is everything probabilistic, or does everything become probabilistic once we use operators to combine results? I suspect things which happened in the past have a probability of 100%; the future, not so much. Is this even true? It would seem so at face value. Perhaps more important is the concept of semantic relationship.

Revisiting the issues surrounding the implication operator, we see this amounts to a semantic relationship (or lack thereof) between the antecedent and the consequent. Clearly these two must hold some valid provable relationship to each other to overcome this problem, but it is fair to ask if this is simply a problem with formalism. I suspect not. I am not sure we can absolutely construct a system of semantic relationships that is consistent and complete. We tread into the boundary between mathematics and concepts when we investigate semantics, and to paraphrase Frege, concepts are areas with fuzzy boundaries. What are we left with when boundaries are ill defined?

It is often helpful to construct a thought experiment which can elucidate some of these issues. Let's use common sense to guide us and remove some difficult issues we are aware of to gain some insight. In other words, let's start with some low hanging fruit. 

To remove the issues of perspective, knowledge and time, let's take a twig. A small piece of wood. Now if I burn this piece of wood, it will at some point cease to be a piece of wood and will instead become ash. This ash may ultimately blow away, and we are left with nothing from something. Let us not concern ourselves at the moment with the semantic relationship between a piece of wood and ash, or with what ash blown away by the wind is, since I suspect this is where we are ultimately heading. Let us simply capture this event using a video recording device.

So if I film the event of a piece of wood burning over time until it becomes ash and blows away in the wind, I can certainly say this event did happen with 100% certainty. My video recording of this event is proof, so even if only I and a handful of others saw this in person, we could certainly share the recording with others. The fact that at some point in time this piece of wood did exist at some place (let's say my patio) and it no longer does could be considered an indisputable fact of something that happened in the past (assuming we don't take into consideration fake or doctored videos, etc.) at a certain location (many recording devices can also capture location) at a certain time (again, let's rely on the recording device's reporting of time). And again, let's not concern ourselves with relativity and observational issues.

This seems like the easiest thing to describe using a formal system and a place where we can begin. Again, let me stress these points: we are excluding the consideration of relativity and the definition of an observer.

So clearly, a formal logic system which could represent this event as an absolute certainty would be a good place to start. We wish to construct a logical formal mathematical system which represents this event absolutely. It will always prove this did happen. It could prove that it is "necessary" that this did happen in the past. I keep stressing the term "necessity" because we will soon be exposed to this basic concept when we review what a modal logic system is. You can look up the difference between "necessarily" and "possibly" in any introduction to modal logic systems if this is still gnawing at you (which I hope it is, and I hope you do), and by constraining our research to the past, for now, we can perhaps make some progress in constructing a system to model reality.

I think this simple thought experiment lays bare what we are really up against here, though, and I don't know of any existing modal logic system which has completely solved this problem yet. While object oriented computer science can aid us a bit in the understanding of predicates such as 'is a' and 'has a', I believe the central predicate we will have to come to terms with is 'is to', and what we are trying to accomplish is the ability to handle the manipulation of concepts and what their 'is to' predicates are. I will refer to this as the semantic predicate or the semantic problem, and I suspect that once we can solve this we will be in a much better place to begin constructing a logical formal mathematical system which can be used to express reality. Using our above example, "ash 'is to' wood as bla" is probably where we are heading.

We have some work to do, but the current state of modal logic is a great place to start.





Modeling Reality, Introduction

Disclaimer:

As I near retirement (and because of recent employment woes) I seem to have some free time on my hands for the first time in a long time. When that happens I often get back to my hobbies, one of which has always been symbolic logic and modeling reality. I usually (as previous blog posts may show) attack this from a philosophical perspective, but sometimes try to get down to the mechanical aspects as well. This post is one of these instances. I will also try to get into Relevance Logic research in future posts but this is kind of a placeholder for me to go over the foundational needs for such research.



So when trying to model reality using logic we are currently fucked. I contend there are a variety of reasons for this, not the least of which is the inability to account for time in a formal logic system. Now it is also the case that folks like Gödel have demonstrated that infinity causes issues as well (damn you, Georg Cantor); however, the universal ('for all') and existential ('there exists') quantifiers aid us a bit here, hence the rationale for what is sometimes referred to as first order predicate calculus.

Now it is probably also the case that if we had similar constructs for time we might be in better shape. Something like a temporal universal ('it is always the case') and existential ('it is sometimes the case') operator could get us out of this hole, but maybe 'before' and 'after' is where it's at. Don't know yet.
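To make this concrete, here is a toy discrete-time sketch in Python of what 'always', 'sometimes', 'before' and 'after' might look like. The names and the ten-moment timeline are my own illustration, not Prior's notation.

# A toy discrete-time version of the idea (my own naming, not Prior's notation):
# a proposition is just a function of time, and we can ask whether it holds
# "always", "sometimes", "before" or "after" a given moment.

timeline = range(0, 10)                    # ten discrete moments, 0..9

def raining(t):
    return t in (2, 3, 7)                  # it rains at moments 2, 3 and 7

def always(prop):
    return all(prop(t) for t in timeline)

def sometimes(prop):
    return any(prop(t) for t in timeline)

def before(prop, t0):
    return any(prop(t) for t in timeline if t < t0)

def after(prop, t0):
    return any(prop(t) for t in timeline if t > t0)

print(always(raining))      # False
print(sometimes(raining))   # True
print(before(raining, 5))   # True  (it rained at 2 and 3)
print(after(raining, 8))    # False (nothing after moment 7)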

But I am still of the belief that our reliance on just two states (true and false) also leads us down a rabbit hole. I have always believed life is not binary, and just having true and false is part of the basic problem. Indeed, in the field of digital electronics alone we have three states (tri-state logic), where the third state (often referred to as Z or high impedance) becomes the unknown. I subscribe to the belief that we may really have an almost infinite number of states, but I am really deficient in this area. Three or four may actually be good enough. I really don't know at this point.
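For what it's worth, here is a little sketch of a three-valued logic along the lines of Kleene's strong tables, treating the third value as "unknown" much like the Z state. This is just one standard way of doing it, not a claim about which many valued system is the right one.

# A sketch of a three-valued (Kleene-style) logic, treating the third value
# as "unknown", roughly analogous to the Z / high-impedance state.

T, F, U = "T", "F", "U"

def and3(a, b):
    if a == F or b == F: return F
    if a == T and b == T: return T
    return U

def or3(a, b):
    if a == T or b == T: return T
    if a == F and b == F: return F
    return U

def not3(a):
    return {T: F, F: T, U: U}[a]

for a in (T, F, U):
    for b in (T, F, U):
        print(a, b, "and:", and3(a, b), "or:", or3(a, b))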

So the solution long term seems to require some additional stuff we currently do not have in our tool chest.

Now A.N. Prior has done some rigorous work regarding temporal logic, and folks like Nicholas Rescher have done good work in the field of many valued logic, so eventually we can lean on them to work out some of the issues arising from time and binary logic.

However, it is also the case that the way we do proofs in mathematical logic relies on what is often called the implication operator. Now the problems here have indeed been investigated and dealt with so I will briefly touch upon some of this work and try to remove this obstacle before we move on to more esoteric topics like temporal and many valued logic.

As is often the case, just stating the problem is sometimes difficult enough but always required before one can understand the solution.

For those who already understand the problem, you can simply jump ahead to the topic of Relevant Logic (or relevance logic) and read the foundational works by C.I. Lewis, Ivan E. Orlov, Wilhelm Ackermann and Alonzo Church, or jump ahead to the magnum opus of the subject, Entailment: The Logic of Relevance and Necessity by Nuel Belnap and Alan Ross Anderson, which I plan on covering in future blog posts.

A decent understanding of formal logic systems, Fitch Charts and Boolean operators as applied to inductive reasoning might also be helpful (but not necessary) before you continue reading this text.

First we introduce the implication operator. It is sometimes called the 'implies' operator or the 'if then' operator. Its use is often called 'material implication'. It is used in formal systems but it has issues. First we will describe it and then look at some typical uses, then we will look at where it fails, and finally look at some potential solutions to these issues. To keep things simple, we will use '->' as the implication operator. We can read p->q as p implies q.

The truth table for this operator (note, it is a weak connective) is simply …

p  q  p->q
----------
T  T    T
T  F    F
F  T    T
F  F    T

When using the implication operator we often call the p argument the antecedent and the q argument the consequent.

When we apply this concept to human language we get the following basic definition.

"It is not the case that p is true and q false". Also, "p implies q" is equivalent to "p is false or q is true". Let's break this down a bit further.

"It is not the case that p is true and q false". This is the second row (T F F).
"p is false or q is true".
Well "p is false" are the last two rows.
"q is true" are rows one and three but we have an 'or p' here so this is actually row one.

For example, "if it is raining, then I will bring an umbrella", is equivalent to "it is not raining, or I will bring an umbrella, or both". This truth-functional interpretation of implication is called material implication or material conditional.

Enumerating this we get

(1) Our original statement (which we state is true).
p = T (it is raining)
q = T (I will bring an umbrella)
p->q = True

(2) Contradicts our original true statement, so it's false.
p = T (it is raining)
q = F (I will not bring an umbrella)
p->q = False

(3) This is a bit odd. It arises from the fact that q is true, but just seems wrong.
p = F (it is not raining)
q = T (I will bring an umbrella)
p->q = True

(4) This seems consistent.
p = F (it is not raining)
q = F (I will not bring an umbrella)
p->q = True


Let's dig a little deeper and look at "p is false or q is true"

~p or q

This means not p or q (~ is the negation operator)

Basically this says all we really give a shit about is q, since anything 'OR' something is true whenever that something is true. This can be weakened to

(p or ~p) or q

So the first part of this seems to logically hold. In other words

q

And indeed when we look at our truth table we see that in all cases when q is true p->q is also true (rows one and three).

But this is also what we call a tautology, since (p or ~p) is always true, and as a result we really don't care about q at all. In other words, since (p or ~p) is always true, why bother with q? We already know the truth value of the entire statement; it's always true.

Now let's keep in mind these two basic facts

(~p or p) = True
(~p and p) = False

Which translate to "anything OR the negation of anything" is always true and "anything AND the negation of anything" is always false. These follow directly from the definitions of the Boolean 'OR' and 'AND' operators (the law of the excluded middle and the law of non-contradiction).

So here we hit our first problem with using the implies operator. It's just not right. Sure, when q is true the value of p->q is true, but if we substitute (~p and p) for p we get

(~p and p) -> q
F -> q

These can be found in the last two rows of our truth table for the implies operator. Put another way, looking at these two rows, False implies True is True (row three) but False implies False is also True (row four) which leaves us with a big WTF?

This is the problem that the logic of entailment tries to address, and it arises from the principle of explosion, which, stated simply, means an inconsistent premise makes any argument valid. From a set theory perspective, if I can derive A and NOT A from a set, the set is said to be inconsistent, because formally I can then derive anything at all from it. In other words, a contradiction must never be provable.
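Here is the same complaint expressed in a few lines of Python: under material implication, a contradictory antecedent "implies" absolutely anything.

# A small demonstration of the complaint above: under material implication,
# a contradictory antecedent (p and not p) "implies" any q whatsoever.

def implies(a, b):
    return (not a) or b

for p in (True, False):
    for q in (True, False):
        antecedent = p and (not p)             # always False
        print(p, q, "(p and not p) -> q:", implies(antecedent, q))
        assert implies(antecedent, q) is True  # true no matter what q is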

It is also problematic that if p is false it implies every q (again, the last two rows in our truth table), because in this case p->q is said to be vacuously true (like a universal statement that is only true because the antecedent cannot be satisfied; for example, "all cell phones in the room are turned off" is true even if there are no cell phones in the room).

So, even though formal logic (and all inductive proofs which rely on it) is somewhat lacking, we can still use it to define things like relativity, but we also understand these systems are inherently limited, and we need look no further than Gödel's proof (the demonstration that a consistent system containing Peano's axioms must be incomplete) to know this is indeed the case.

Now that we understand the problem a bit better, we can move on to some potential solutions.

For those of you still reading this hokum, here is an interesting site. It is basically a Fitch Chart/Diagram helper script which can be used to try out logic statements using many operators (not just implies) to build well formed formulas (WFFs).

http://www.harmendeweerd.nl/fitch/






Saturday, April 25, 2020

A Layperson's Brief Review of AI In My Lifetime

I was born in 1958. My high school graduating class was 1976, and I attended college from 1979 through 1985, so this sets the foundation for my educational experience.

My first job in the computer industry was as a software developer for a company named Computers 101 in Hollywood, Florida, where we sold microcomputers and I wrote software applications for various customers. I eventually worked for IBM in the early 1980s in an Industrial Automation group, at Amazon from 1999-2004, and at many smaller companies in between. At one point while at IBM I worked on a robotic arm that painted car fenders coming down a conveyor belt. This could be considered the extent of my professional experience with AI. Not really AI at all.

I have been interested in AI since my college days. My major was in Information Processing with a minor in Computer Systems. I did a directed independent study program with one of my professors, Marty Solomon, in my last year. It was a probability-first search algorithm written in LISP. Very simply put, the program would search for a result, and once the result was discovered it would go back to each node in the tree and update that node's probability of success when searching for a particular category of goal. So the next search, rather than traversing the tree in a depth first or breadth first manner, would use this probability-first approach. This may be considered the extent of my academic knowledge of AI. Again, not much at all.
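The original LISP is long gone, but the general idea, as best I can reconstruct it, looks roughly like this in Python (a sketch of the approach, not the actual program).

# A rough reconstruction of the idea (the original was in LISP and is long
# gone, so treat this as a sketch of the approach): children are visited
# highest-probability-first, and every node on a successful path gets its
# probability nudged upward for next time.

class Node:
    def __init__(self, name, children=None):
        self.name = name
        self.children = children or []
        self.prob = 0.5                       # initial estimate of success

def search(node, goal):
    if node.name == goal:
        return [node]
    # probability-first: try the most promising child before the others
    for child in sorted(node.children, key=lambda c: c.prob, reverse=True):
        path = search(child, goal)
        if path:
            return [node] + path
    return None

def reinforce(path, step=0.1):
    for node in path:                         # reward every node on the path
        node.prob = min(1.0, node.prob + step)

root = Node("root", [Node("a", [Node("goal")]), Node("b")])
path = search(root, "goal")
if path:
    reinforce(path)
    print([n.name for n in path], [round(n.prob, 2) for n in path])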

As you might expect, I have been interested in AI since these days. My interests stemmed from the desire to model human thought and behavior more than the ability to have a machine learn for the sake of learning. They are actually two very different things. 

Training a machine to be more performant than a human, or to solve a problem in a way a human never would, was never an interest of mine. In the current time (around 2020) the concept of machine learning has become more of an attribute weighting approach, while writing code which writes code (what I tend to consider true AI) has not gotten much traction. It is the latter which I was always more interested in. It is the former which tends to be more productive and profitable.

Now my first exposure to what was considered AI was a program called Eliza. This program, written by Joseph Weizenbaum in the 1960s, mimicked a psychotherapist, so obviously it held some interest to me. It was a somewhat simplistic program, not much different from the old game program where you would ask it a question and, if it didn't know the answer, it would ask you to tell it the answer, store what you said, and from then on know it. The next time it was asked the same question it would simply repeat the answer.

So for example, you might ask the program, "what is a kangaroo?" It would answer "I don't know" and then turn around and ask you, "what is a kangaroo?" You might answer "a kangaroo is a mammal", and the next time you asked the program what a kangaroo is, it would respond with "a kangaroo is a mammal". This is a form of knowledge retention, but hardly artificial intelligence. It actually demonstrates the difference between knowledge and intelligence.
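That old game program boils down to just a few lines; something like the following Python sketch (my own reconstruction, not the original).

# A sketch of that old question-and-answer game: if the program does not
# know the answer it asks you, stores what you say, and parrots it back
# the next time.

knowledge = {}                     # question -> stored answer

def ask(question):
    if question in knowledge:
        return knowledge[question]
    answer = input("I don't know. " + question + " ")
    knowledge[question] = answer
    return "Thanks, I'll remember that."

# First time: ask("what is a kangaroo?") prompts you and stores your answer.
# Second time: the same call simply returns what you told it.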

The Eliza program was not much different. It tried to do some basic reasoning, but its famous out when it didn't know something was to ask, "well, how does that make you feel?" Pretty much what a psychologist would charge you for, so if nothing else, it was economical.

Actually, back in 1950 Alan Turing proposed the Turing Test, which is basically the idea that we have achieved artificial intelligence when, given a conversation between two entities (one a human, the other a computer), neither of which may be seen by a human evaluator, the evaluator cannot tell the difference between the human and the computer. The Loebner Prize actually pays out a monetary award to the winner of an annual contest along these lines (https://en.wikipedia.org/wiki/Loebner_Prize), and reading through some of the transcripts is often entertaining as well as educational. For example, it has taught me that this is no longer a valid test for artificial intelligence, as I believe the goal in a Turing Test these days is to actually dumb down the computer participant.

I'll give you a concrete example of what I mean. In one exchange (in a transcript from one Loebner contest) the human tells the computer, "Oh, you are located in New York. I am located in Australia." The human then asks the computer, "Are you East or West of me?" to which the computer responds "both". A dead giveaway, as computers are more logical than humans. Most humans would not answer in this manner, even though it is technically the correct answer.

Back in the 1980s, the programming language Prolog was a popular approach to creating what were known at the time as Expert Systems. It was built around something called Horn clause logic and provided a grammar for expressing logic in this format. This is not much further advanced than Aristotelian syllogisms, except that I believe it supported first order predicate calculus (the universal and existential quantifiers), but it was also a somewhat mechanical deduction approach. Possibly how humans think; probably not.

Which brings me to my summary of what I believe is considered artificial intelligence these days. Keep in mind I have not been involved in AI in any capacity since my college days (about 40 years ago), nor have I done any AI type coding or any deep dives into the literature on the subject for many years, so at best this may be considered a layman's perspective.

These days, it seems there are three basic approaches to AI, though it is probable that each borrows some methods from the others.

I will use (1) attribute weighting, (2) the popular Amazon product 'Alexa' and (3) the IBM product 'Watson' to discuss their basic differences as I have come to believe them to be. There are obviously other variants and different products I am not aware of, and I am sure some cross-pollination has occurred; however, these will suffice to demonstrate the fundamental differences as I see them. Again, keep in mind I have no in-depth knowledge of any of these examples, and what I am about to explain is simply what I have come to believe from discussions with friends in the field. I have never interfaced with any of these three, nor do I have any internal insight into how they go about their business. Again, simply a layperson's perspective.

Attribute weighting is much like my directed independent study approach mentioned above. A goal is provided, and the code goes through the various attributes it uses to arrive at an answer, adjusting the weight of the various attributes until it arrives at the correct conclusion using the adjusted set of attribute values (or weights).
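A bare-bones sketch of what I mean, in Python. This is essentially a perceptron-style update and is only meant to illustrate the general idea of nudging weights toward the right answer, not any particular product's method.

# Nudge each attribute's weight up or down until the weighted sum lands on
# the right side of the answer (a perceptron-style update, for illustration).

examples = [                       # (attribute values, correct yes/no answer)
    ([1.0, 0.0, 1.0], 1),
    ([0.0, 1.0, 0.0], 0),
    ([1.0, 1.0, 1.0], 1),
]
weights = [0.0, 0.0, 0.0]
rate = 0.1

for _ in range(20):                # a few passes over the examples
    for attrs, target in examples:
        guess = 1 if sum(w * a for w, a in zip(weights, attrs)) > 0 else 0
        error = target - guess
        weights = [w + rate * error * a for w, a in zip(weights, attrs)]

print(weights)                     # the adjusted attribute weights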

Alexa is what I like to think of as a crowd sourced version of the game program described above (using the kangaroo example). So you ask Alexa a question; if it does not have the answer in its data store, it will go out to a crowd of mechanical turks (see the Amazon Mechanical Turk program https://www.mturk.com/ for more information), take the responses, figure out the most popular one, and add that to its data store. The next time the question is asked, the answer will come from this data store.
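As a sketch, my outsider's guess at that workflow looks something like this; ask_the_crowd() here is a made-up stand-in, not any real Amazon API.

# My outsider's guess at the workflow, as a sketch. ask_the_crowd() is a
# made-up stand-in: imagine it farms the question out to several human
# workers and returns their answers.

from collections import Counter

data_store = {}

def ask_the_crowd(question):
    # hypothetical: several workers answer; here we just fake three responses
    return ["a mammal", "a mammal", "a marsupial"]

def answer(question):
    if question not in data_store:
        responses = ask_the_crowd(question)
        most_popular, _ = Counter(responses).most_common(1)[0]
        data_store[question] = most_popular          # cache the consensus
    return data_store[question]

print(answer("what is a kangaroo?"))   # "a mammal", then served from the store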

Watson tends to take a more syllogistic approach, where it tries to use deductive reasoning to derive new facts from its data store of known facts. Much like the canonical example ...

All men are mortal.
Socrates is a man.
Therefore, Socrates is mortal.

Watson will search through its data store of known facts and attempt to derive new facts using the existing set of facts. If these new facts are indeed shown to be true, they get added back into the data store of known facts.
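A toy forward-chaining loop in that spirit might look like the following Python sketch; this is a generic textbook technique, not IBM's actual implementation.

# A toy forward-chaining loop (a generic textbook technique, not IBM's
# actual implementation): keep applying rules to the known facts and add
# whatever new facts can be derived.

facts = {("man", "Socrates")}
rules = [
    # if X is a man, then X is mortal
    (("man",), ("mortal",)),
]

changed = True
while changed:
    changed = False
    for premises, conclusion in rules:
        for kind, x in list(facts):
            if kind in premises and (conclusion[0], x) not in facts:
                facts.add((conclusion[0], x))
                changed = True

print(facts)   # now also contains ("mortal", "Socrates")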