Johnny5 Posted April 10, 2005 After talking with Rev, it became apparent that I need to review linear transformations. Last night, I spent about 4 hours reading through 3 different books on linear algebra. All I want to do right now is fully and competently understand what a linear transformation is. Here is where I am at, and I would appreciate any help. Ok first of all... The first thing that became apparent to me was that I need to be able to logically express the concept of "exactly one," as opposed to merely "there is at least one." And I am a stickler about my logic. So for right now, I would like to know exactly how some of you address the issue of "exactly one" using symbolic logic. Let it be presumed that someone knows first order logic, and wishes to use the symbols from it to define the concept of a linear transformation. Now, in thinking of the graph of a function, you can see that for any x, there is one and only one f(x). But I want to say this using first order logic. So here is how I started off: [math] f: A \to B [/math] f, A, B are sets. A is the domain of the function, B is called the codomain. [math] \forall x \in A \ \exists y \in B [ (x,y) \in f \ \& \ \neg \exists z \in B [(x,z) \in f \ \& \ \neg (z=y) ]][/math] Translation: For any x an element of the domain A, there is at least one element y of the codomain B, such that: The ordered pair (x,y) is an element of set f, and it is not the case that there is at least one z in the codomain B, such that (x,z) is in f, but z is different from y. (The z stuff was to make sure that any point x maps to one and only one f(x).) Question: Did I express the meaning of function correctly with that? I don't want to know if you like the symbolism, I want to know if I got the meaning right. Thank you
Dave Posted April 10, 2005 I don't quite understand what you're trying to accomplish with this. I'm not even sure you've got the idea of a linear transformation. From what I gather of your post (with the "one and only one"), you seem to think that a linear transformation has to be bijective, which isn't the case. So, in short, what are you getting at? As far as I'm aware, given two vector spaces U and V over the same field and a function [math]L : U \to V[/math], L is linear if, for all vectors [math]\mathbf{u}, \mathbf{w} \in U[/math] and all scalars [math]\alpha, \beta[/math]: [math]L(\alpha \mathbf{u} + \beta \mathbf{w}) = \alpha L(\mathbf{u}) + \beta L(\mathbf{w})[/math] I just don't see the appeal of this method.
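A small numerical illustration of the linearity condition quoted in the post above, as a minimal Python sketch. The maps `project_xy` and `shift` and the helper `is_linear_on_samples` are hypothetical names made up for this example. The check only tries a handful of vectors and scalars, so it can refute linearity but cannot prove it.

```python
from fractions import Fraction
from itertools import product


def project_xy(v):
    """Hypothetical example map R^3 -> R^3: drop the z component."""
    x, y, z = v
    return (x, y, 0)


def shift(v):
    """Hypothetical non-linear map: translate every vector by (1, 0, 0)."""
    x, y, z = v
    return (x + 1, y, z)


def combo(a, u, b, w):
    """Return the linear combination a*u + b*w, componentwise."""
    return tuple(a * ui + b * wi for ui, wi in zip(u, w))


def is_linear_on_samples(L, vectors, scalars):
    """Test L(a*u + b*w) == a*L(u) + b*L(w) on every sampled combination."""
    for u, w in product(vectors, repeat=2):
        for a, b in product(scalars, repeat=2):
            if L(combo(a, u, b, w)) != combo(a, L(u), b, L(w)):
                return False
    return True


vectors = [(1, 0, 0), (0, 2, -1), (3, 2, -7)]
scalars = [Fraction(-2), Fraction(1, 2), Fraction(3)]

print(is_linear_on_samples(project_xy, vectors, scalars))  # True
print(is_linear_on_samples(shift, vectors, scalars))       # False
# The projection sends two different inputs to the same output,
# so it is linear without being injective (hence not bijective).
print(project_xy((0, 0, 1)) == project_xy((0, 0, 5)))      # True
```

Exact fractions are used for the scalars so that the equality test is not disturbed by floating-point rounding.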
Johnny5 Posted April 10, 2005 Author Oh sorry, I didn't say what I mean... In trying to understand "linear transformation" I got pushed backwards to understanding the meaning of "function." In trying to understand the meaning of "function" I realized I need to express "exactly one." In other words, the final goal is to understand linear transformation fully and competently, but I kept having to go backwards, unfortunately. Hoffman and Kunze were the authors of the book I was reading. Obviously, I started off reading the chapter on Linear Transformations, and saw basically what you have. I want to never forget this again.
Dave Posted April 10, 2005 Posted April 10, 2005 In trying to understand "linear transformation" i got pushed backwards to understanding the meaning of "function." Okay, this is fine. Just as long as you understand that there's not really a lot to them; they're just a way of corresponding elements in one set with elements in another set. In trying to understand the meaning of "function" i realized I need to express "exactly one." I don't get what you mean by "exactly one". I don't really see how this fits into the picture either. In other words, the final goal is to understand linear transformation, but i kept having to go backwards unfortunately. From personal experience - this won't get you anywhere fast. Concentrate on learning a single topic first, then move onto other things. Functions and set theory is a bit of a mountain by itself. If you want to learn about linear transformations, then you need to get your head around the concept of the vector space. You then need to concentrate on learning about linear independence, spanning and basis vectors. After this, you might want to detour and read up on subspaces, which are rather interesting. Then, move onto linear transformations and from there you can get a solid grounding.
Johnny5 Posted April 10, 2005 Author I've done all that before Dave, how can I say this... Ok here... I want to put a massive amount of information in me as fast as possible. Now, let me ask you something very simple. Is a linear transformation a function? And I know set theory, studied it quite intensely as a matter of fact. I got disgusted with ZFC, and I'll just stick with naive set theory for now. I was reading a book by Patrick Suppes, "Axiomatic Set Theory" in fact, and he defined set as follows: [math] A \text{ is a set} \Leftrightarrow (A= \emptyset \vee \exists x (x \in A)) [/math] The vector space axioms are in the book, and I looked at them all over again. There are two operations on a vector space: vector addition, and scalar multiplication. Vector addition obeys the parallelogram law. And multiplication by a real scalar changes the length of the vector, and reverses its direction when the scalar is negative. There was nothing particularly interesting in the first few theorems, so I didn't bother with them. You are missing what I need here... 
Just forget about anything I asked about linear transformations, and in your own words, tell me what a function is...
Dave Posted April 10, 2005 There was nothing particularly interesting in the first few theorems, so I didn't bother with them. If you're not prepared to put up with the boring stuff, don't expect to understand the interesting stuff as well as you should. You are missing what I need here... Just forget about anything I asked about linear transformations, and in your own words, tell me what a function is... A function takes an element in some set A, and relates it to an element in a set B. That's about all there is to it, as far as I'm aware. Why start a thread on "linear algebra review" if what you want to learn has nothing to do with linear algebra?
Johnny5 Posted April 10, 2005 Author I will go back and regain whatever I need. Right now I am on 'function'.
Johnny5 Posted April 10, 2005 Author Dave, I need to be able to think about 'function' using first order logic. I need to be able to think clearly about it. Saying the graph passes the vertical line test simply doesn't cut it anymore. I need the definition of 'function' using first order logical symbols, and naive set theory. Be as perfect as you are capable of, please. Thank you very much
Dave Posted April 10, 2005 Posted April 10, 2005 Saying the graph passes the vertical line test simply doesn't cut it anymore. I never really implied that did I? If you want to define it a bit more thoroughly you might think about it in terms of an equivalence relation or some other binary relation. Have a look at: http://www.cs.odu.edu/~toida/nerzic/content/function/definitions.html
Johnny5 Posted April 10, 2005 Author Dave, too much all at once. Can you simplify the definition? Forget about relation right now. I don't think you need to define 'function' using 'relation.' I want a string of first order logical symbols which communicates the idea of function rapidly. Something like this: Presume your student is comfortable with using coordinates to represent points in three dimensional space. ASSUME THAT, but little else. And you want to teach them what a function is, so that they never forget. Now, start out with the special case of the XY plane. Now, do me a favor, and criticize my definition here: Definition: f is a function from A into B if and only if [math] \forall x \in A \ \exists! y \in B \ [(x,y) \in f] [/math]
Dave Posted April 10, 2005 My problem with that definition is that it doesn't really imply how f acts on x. I don't quite understand the notation of B[(x,y) in f] either. At some point, you're going to have to start defining things in terms of other things. In my opinion, relations are a rather nice, neat and concise way of doing this, and they follow pretty much from set theory. I don't feel that I'm qualified enough to say much else, I'm afraid. I haven't worked with set theory very much at all over the past few years and my first order logic is rather sketchy at best. FYI, if a student was comfortable with spatial co-ordinates and not much else, I wouldn't start throwing abstract maths at them.
Johnny5 Posted April 10, 2005 Author Dave... B was just the codomain. (x,y) is an ordered pair. And I have decided to use [math] \exists! [/math] whenever I want to say "there is exactly one"; the notation exists, and it's rarely used. It minimizes the mental effort required to interpret what is being looked at. Translation: f is a function that maps A into B if and only if, for any x an element of A, there is one and only one element y of B such that the ordered pair (x,y) is an element of f. Dave, I need something using first order logic please. You win. Ok, I read the Mathworld definition of Relation. Luckily I know what a Cartesian product is. So let me ask you if you can define 'relation' using first order logical symbols please? I can have a try at it I guess... [math] A \times B = \{ (x,y) \mid x \in A \ \& \ y \in B \} [/math] Translation: The Cartesian product of two sets A, B is the set of all ordered pairs (x,y) such that x comes from set A, and y is an element of set B. When A and B are both the real numbers, it is the set of all points in the XY plane. A binary relation from A to B is a subset of A X B, and a relation on A is a subset of A X A.
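For finite sets, the definitions in the post above (Cartesian product, binary relation, and "exactly one" for functions) can be checked mechanically. The following is only a sketch: the function names `cartesian_product`, `is_relation` and `is_function`, and the sample sets, are made up for illustration, and the extra requirement that f be a subset of A X B is an assumption added here rather than part of the definition as posted.

```python
from itertools import product


def cartesian_product(A, B):
    """The set of all ordered pairs (x, y) with x in A and y in B."""
    return set(product(A, B))


def is_relation(f, A, B):
    """A binary relation from A to B is a subset of A x B."""
    return set(f) <= cartesian_product(A, B)


def is_function(f, A, B):
    """True iff f is a relation from A to B and, for every x in A,
    there is exactly one y in B with (x, y) in f."""
    if not is_relation(f, A, B):
        return False
    return all(sum(1 for y in B if (x, y) in f) == 1 for x in A)


A = {1, 2, 3}
B = {"a", "b"}

f = {(1, "a"), (2, "a"), (3, "b")}            # every x has exactly one image
g = {(1, "a"), (1, "b"), (2, "a"), (3, "b")}  # 1 has two images
h = {(1, "a"), (2, "b")}                      # 3 has no image

print(is_function(f, A, B))  # True
print(is_function(g, A, B))  # False
print(is_function(h, A, B))  # False
```

Note that `f` sends both 1 and 2 to "a" and is still a function: "exactly one" constrains the outputs per input, not the inputs per output, which is why a function need not be bijective.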
Johnny5 Posted April 11, 2005 Author Definition: A linear equation with associated field [math] \mathbb{F} [/math] is an equation of the following form: [math] a_1 x_1+a_2x_2+ \cdots +a_nx_n = b [/math] The x terms are the variables of the equation, and the other terms are constants. The a terms are called coefficients. All the constants are elements of the field [math] \mathbb{F} [/math]. The two most common fields are the real numbers [math] \mathbb{R} [/math], and the complex numbers [math] \mathbb{C} [/math]. Consider the case where n=2, and [math] \mathbb{F}=\mathbb{R} [/math]. For this case we have: [math] a_1 x +a_2y = b [/math] Where [math] a_1,a_2,b \in \mathbb{R} [/math]. Provided the coefficients a_1 and a_2 are not both zero, this is the equation of a straight line in the XY plane. Once the variables are instantiated with real numbers, the resulting expression is an example of a mathematical statement, and as with any statement, it is either true or false. Before the variables are instantiated, it's not a statement. After the variables have been instantiated, either the LHS will equal the RHS or not. In the case that the LHS=RHS the statement is true, otherwise the statement is false. The point is, we can use binary logic to analyze linear equations over a field. Now, a straight line in the XY plane is actually a subset of the entire plane. We can define the XY plane using the set theoretic notion of the Cartesian product. Definition: The Cartesian product A X B is the set of all two-tuples (x,y) where x is an element of set A, and y is an element of set B. And we can write this symbolically, as follows: [math] \{ (x,y) \mid x \in A \ \& \ y \in B \} [/math] So in the case of the XY plane, set A as well as set B are the real number system [math] \mathbb{R} [/math]. 
So, for the XY plane we have: [math] XY \ \ plane = \{ (x,y) \mid x \in \mathbb{R} \ \& \ y \in \mathbb{R} \} [/math] So, the XY plane is the set of all ordered pairs (x,y), where x is an element of the real number system, and y is an element of the real number system. Now, the equation [math] a_1x + a_2 y = b [/math] is going to be a true statement for some ordered pairs, and a false statement for others. But the point is that if you plot all the points (x,y) which make the equation a true statement, rather than a false statement, a straight line emerges. Now, it may occur to you that two distinct points determine a unique straight line. So if you want to check this, you can use graph paper. First plot two points, then draw the straight line through them. In principle, the line you drew is the set of all points for which the equation is true. All other points in the XY plane are locations where the equation is false. However, real space is three dimensional, and we may wonder how to express a formula for an arbitrary straight line in three dimensional Euclidean space. Before tackling that question, let's first consider what is called a system of simultaneous linear equations. It may occur to you that, given two distinct straight lines in the XY plane, either those straight lines are parallel, or not. If two straight lines in the XY plane are not parallel, then they must intersect, and conversely if two distinct lines in the XY plane intersect then they aren't parallel. So now, suppose that you are given two straight lines in the XY plane chosen at random by your professor. Suppose he gave you the following: [math] L1 \equiv \{ (x,y) \mid y=3x+1 \} [/math] Thus, straight line L1 is the set of all points (x,y) such that the equation y=3x+1 is true. Some of you will recognize the equation as the slope-intercept form of a straight line. In an equation of the form y=mx+b, m is the slope of the straight line, and b is the y-intercept of the line. 
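The idea that a line is the set of points where the equation is a true statement, and that two distinct points pin the line down, can be tried out with a short Python sketch. The helper names `satisfies` and `line_through` are hypothetical, and the sample points are chosen here for illustration from the line y=3x+1 given above.

```python
def satisfies(a1, a2, b, point):
    """True iff the point (x, y) makes a1*x + a2*y = b a true statement."""
    x, y = point
    return a1 * x + a2 * y == b


def line_through(p, q):
    """Coefficients (a1, a2, b) of one equation a1*x + a2*y = b
    satisfied by both of the distinct points p and q."""
    if p == q:
        raise ValueError("two distinct points are needed")
    (x1, y1), (x2, y2) = p, q
    a1 = y2 - y1          # (a1, a2) is perpendicular to the direction q - p
    a2 = x1 - x2
    b = a1 * x1 + a2 * y1
    return a1, a2, b


# Two points on y = 3x + 1, i.e. on -3x + y = 1.
a1, a2, b = line_through((0, 1), (1, 4))
print((a1, a2, b))                   # (3, -1, -1), a multiple of -3x + y = 1
print(satisfies(a1, a2, b, (2, 7)))  # True:  7 = 3*2 + 1
print(satisfies(a1, a2, b, (2, 8)))  # False: 8 != 3*2 + 1
```

The coefficients come out as a scalar multiple of the ones you would write by hand, which is expected: multiplying a linear equation by a nonzero constant does not change its set of solutions.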
Now, here is the other line he chose: [math] L2 \equiv \{ (x,y) \mid y=2x+4 \} [/math] Now, here is the test question: Do the two given straight lines intersect, and if so, where? At this point, you want to solve a system of simultaneous linear equations. Now, there is already a standard procedure for doing this, but it involves matrix algebra. But you are a bright student, and you already know it. Nonetheless, let's pretend you don't, and solve a problem or two; perhaps you may need a refresher. First write the two equations he gave you as follows: [math] y-3x=1 [/math] [math] y-2x=4 [/math] Now, utilize matrix notation, and write the linear system as follows, listing the unknowns in the order y, x so that they match the columns of coefficients: [math]\left[ \begin{array}{cc} 1 & -3\\ 1 & -2\\ \end{array} \right] \left[ \begin{array}{c} y\\ x\\ \end{array} \right] = \left[ \begin{array}{c} 1\\ 4\\ \end{array} \right] [/math] The first matrix is usually called the coefficient matrix, for obvious reasons. In this case here, we have a two-by-two matrix. Now, in order to answer your test question, write the system in the following augmented form: [math]\left[ \begin{array}{cc|c} 1 & -3 & 1\\ 1 & -2 & 4\\ \end{array} \right] [/math] Now, you have to perform a finite sequence of elementary row operations, to get the system into the following form: [math]\left[ \begin{array}{cc|c} 1 & 0 & a\\ 0 & 1 & b\\ \end{array} \right] [/math] There are three standard elementary row operations; they are: 1. Multiplication of any row by a nonzero scalar; this is called scalar multiplication. 2. Row addition (In this process, you add the elements of one row, or a scalar multiple of them, to the elements of another row of your choice.) 3. Row interchange (Here, all you do is switch two rows.) So let us solve your professor's system of simultaneous linear equations, to determine whether or not these two straight lines are parallel, and if not, then where in the XY plane they intersect. 
We shall do this by performing a finite sequence of elementary row operations. Elementary row operations do not change the set of solutions, so the matrices below are row equivalent (written with the symbol ~) rather than equal. [math]\left[ \begin{array}{cc|c} 1 & -3 & 1\\ 1 & -2 & 4\\ \end{array} \right] \sim \left[ \begin{array}{cc|c} 1 & -3 & 1\\ -1 & 2 & -4\\ \end{array} \right] [/math] The matrix on the right above was obtained using the elementary row operation known as scalar multiplication. The second row was multiplied by the scalar -1. [math]\left[ \begin{array}{cc|c} 1 & -3 & 1\\ -1 & 2 & -4\\ \end{array} \right] \sim \left[ \begin{array}{cc|c} 1 & -3 & 1\\ 0 & -1 & -3\\ \end{array} \right] [/math] The matrix on the right above was obtained using the elementary row operation known as row addition. The first row was added to the second row. [math]\left[ \begin{array}{cc|c} 1 & -3 & 1\\ 0 & -1 & -3\\ \end{array} \right] \sim \left[ \begin{array}{cc|c} 1 & -3 & 1\\ 0 & 1 & 3\\ \end{array} \right] [/math] The matrix on the right above was obtained using the elementary row operation known as scalar multiplication. The second row was multiplied by negative one. [math]\left[ \begin{array}{cc|c} 1 & -3 & 1\\ 0 & 1 & 3\\ \end{array} \right] \sim \left[ \begin{array}{cc|c} 1 & -2 & 4\\ 0 & 1 & 3\\ \end{array} \right] [/math] The matrix on the right above was obtained using row addition; row two was added to row one. Now, we can streamline the process, if we are willing to perform two or more elementary row operations at a time. In the next step, scalar multiplication as well as row addition will be used in the same step. First row two will be multiplied by two, and then the result will be added to row one. 
[math]\left[ \begin{array}{cc|c} 1 & -2 & 4\\ 0 & 1 & 3\\ \end{array} \right] \sim \left[ \begin{array}{cc|c} 1 & 0 & 10\\ 0 & 1 & 3\\ \end{array} \right] [/math] Notice that the coefficient matrix is now the 2X2 identity matrix I. Now, since row equivalence is transitive, it follows that: [math]\left[ \begin{array}{cc|c} 1 & -3 & 1\\ 1 & -2 & 4\\ \end{array} \right] \sim \left[ \begin{array}{cc|c} 1 & 0 & 10\\ 0 & 1 & 3\\ \end{array} \right] [/math] Now, write the right-hand matrix above as a matrix equation: [math]\left[ \begin{array}{cc} 1 & 0\\ 0 & 1\\ \end{array} \right ] \left [ \begin{array}{c} y\\ x\\ \end{array} \right] = \left [ \begin{array}{c} 10\\ 3\\ \end{array} \right] [/math] Now, convert back to linear equations: 1y+0x=10 0y+1x=3 From which we can rapidly infer that y=10 and x=3 is the solution of the original system of simultaneous linear equations. As a check, 3(3)+1=10 and 2(3)+4=10. What this means geometrically is that the two given straight lines are not parallel; instead they intersect at one and only one point in the XY plane, and that point is the ordered pair (x,y)=(3,10). And we are done. Now, in the example above, we converted from matrix equations to linear equations at the end of the problem. The procedure for doing this is called matrix multiplication. The basis of matrix multiplication is the dot product of two vectors. Here is the definition of the dot product of two vectors: Definition: [math] \vec A \bullet \vec B \equiv |\vec A| |\vec B| \cos(\vec A,\vec B) [/math] Translation: The dot product of vector A with vector B is defined to be the magnitude of vector A, times the magnitude of vector B, times the cosine of the angle between the two vectors. Matrix Multiplication Computing the product of two matrices is actually quite simple, if you know the dot product. 
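Before moving on to the dot product, the row reduction worked through above can be replayed mechanically. This is a minimal sketch, not the procedure from any particular textbook: `gauss_jordan` is a hypothetical helper written for this example, it assumes a square, invertible coefficient matrix, and it reports the unknowns in the same order as the columns of the coefficient matrix, without naming them. The input is the augmented matrix with rows (1, -3 | 1) and (1, -2 | 4).

```python
from fractions import Fraction


def gauss_jordan(aug):
    """Reduce an augmented matrix [A | b] to [I | solution] using the three
    elementary row operations, with exact fractions.
    Assumes A is square and invertible; raises ValueError otherwise."""
    m = [[Fraction(v) for v in row] for row in aug]
    n = len(m)
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            raise ValueError("coefficient matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]           # row interchange
        m[col] = [v / m[col][col] for v in m[col]]    # scalar multiplication
        for r in range(n):
            if r != col:
                factor = m[r][col]
                # row addition: add (-factor) times the pivot row to row r
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [row[-1] for row in m]


solution = gauss_jordan([[1, -3, 1], [1, -2, 4]])
print(solution)  # [Fraction(10, 1), Fraction(3, 1)]

# Substitute back into both rows of the original system.
print(1 * solution[0] - 3 * solution[1] == 1)  # True
print(1 * solution[0] - 2 * solution[1] == 4)  # True
```

The sketch takes a shorter route than the hand computation (it clears each column in one pass), but it ends at the same reduced matrix, with 10 and 3 in the right-hand column.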
I will quickly review the dot product of two arbitrary vectors in three dimensional Euclidean space. Suppose you are asked to compute the dot product of the following two vectors: [math] 3 \hat i + 2 \hat j - 7 \hat k [/math] [math] 2 \hat i + 5 \hat j + 3 \hat k [/math] What is needed now is a formula which explains how to compute the dot product in terms of the x,y,z components of the vectors in your frame. Now, the magnitude of a vector comes from the three dimensional version of the Pythagorean theorem. The three dimensional version actually comes from the two dimensional version, so if you know the two dimensional form of the Pythagorean theorem, then you can easily understand the three dimensional version. I will briefly digress, to prove this. An arbitrary position vector R in a given reference frame S can be looked at as the sum of three vectors, one purely in the i^ direction, another purely in the j^ direction, and the last purely in the k^ direction. That is, [math] \vec R = x\hat i + y \hat j + z \hat k [/math] So temporarily ignore the k^ direction, and focus on the projection of R onto the XY plane of the frame. [math] \vec R = (x\hat i + y \hat j) + z \hat k [/math] [math] proj(\vec R)_{XY} = (x\hat i + y \hat j) [/math] The projection of R onto the XY plane is itself a vector entirely in the XY plane. For the sake of simplicity, suppose that the head of the projection vector lies in the first quadrant of the XY plane, so that x and y are both positive. Thus, this vector is the hypotenuse of a right triangle, all the points of which lie in the XY plane. Using the definition of vector addition, the following vector equation is a true statement: [math] \vec h = x \hat i + y \hat j [/math] where the magnitude of h vector will be seen to come from the Pythagorean theorem. And of course h vector is the vector projection of R onto the XY plane, that is: [math] \vec h \equiv proj(\vec R)_{XY} [/math] So we now have a vector triangle, with the following sides: [math] 1. \ x \hat i [/math] [math] 2. \ y \hat j [/math] [math] 3. \ \vec h [/math] The magnitude of the first vector is x. The magnitude of the second vector is y. And we treat the magnitude of vector h as unknown, for now.
The proof that the triangle contains a right angle is as follows: One side of the triangle has length x, and points in the i^ direction. As a vector, its tail is located at the origin of the frame, and its head is a distance x away from the origin, at location (x,0,0) in the frame. Then there is another vector, whose tail is located at (x,0,0) in the frame, and whose head is located at (x,y,0), in the first quadrant of the XY plane. And, by stipulation, its direction is that of j^. So in order to prove that the angle between these two vectors is a right angle, it suffices to prove that the angle between i^ and j^ is a right angle. Now, in the design of the reference frame, the Y axis is stipulated to be perpendicular to the X axis. And what that means comes directly from Euclid. It means this... it means that a straight line has fallen upon another straight line, and made the adjacent angles equal. And when this happens, each of the angles is called a right angle (Euclid, Book I, Definition 10), and all right angles are equal (Euclid, Postulate 4). The point is this: the leg of the triangle with length y lies upon a straight line which is parallel to the Y axis. It cannot be the Y axis, because it contains a point which isn't on the Y axis, namely the point located at (x,0,0) in the frame. The X axis is a transversal which cuts both the Y axis and this parallel line, so the angle it makes with the parallel line is equal to the angle it makes with the Y axis (Euclid, Book I, Proposition 29), which is stipulated to be right. Therefore, there is an angle inside the triangle which is a right angle. 
And now, by the Pythagorean theorem, the sum of the areas of the squares constructed on the legs of the triangle will be equal to the area of the square constructed on the hypotenuse. Now, here is the definition of the area of a square: Definition: Let S denote the length of a side of a square. The area of the square is defined to be equal to the product of S with itself. As a formula... [math] A_{square} = S \cdot S = S^2 [/math] So one leg of the right triangle has length x; therefore, the square on that leg has area [math] x^2 [/math]. Another leg of the right triangle has length y; therefore, a square constructed on that leg has area [math] y^2 [/math]. The sum of these two areas is: [math] x^2 + y^2 [/math] The area of the square on the hypotenuse is, by definition, equal to: [math] h^2 [/math] And by the Pythagorean theorem, they are equal. That is... [math] x^2 + y^2 = h^2 [/math] And now, we can solve for h, by taking the square root of each side of the equation above. By the theory of equations, there are two values of h which make the statement above true, one positive, and one negative: [math] h = \pm \sqrt{x^2 + y^2} [/math] But distance is never negative, so using the interpretation that h is the length of the side of something, h must be positive. Disregarding the negative root, we have: [math] h = \sqrt{x^2+y^2} [/math] So, the formula above for h gives the length of the projection of R onto the XY plane. Now, the point of this digression was to demonstrate the three dimensional version of the Pythagorean theorem. Now, the hypotenuse of the right triangle can also be viewed as a vector, whose tail is located at the origin, and whose head is located somewhere in the first quadrant of the XY plane. And we have expressed this using vector notation, as follows: [math] \vec h [/math] Now, R vector also has its tail located at the origin of the frame; however, the head of R vector lies outside of the XY plane (assuming z is not zero). 
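The two-step use of the Pythagorean theorem that this digression is building toward can be sketched numerically. The sketch assumes that the second triangle, with legs h and z k^, is also a right triangle, which is what the rest of the post sets out to argue; the function name `magnitude_via_projection` is made up for this example, and the sample vector is the first one from the dot product exercise.

```python
import math


def magnitude_via_projection(x, y, z):
    """Length of R = x i^ + y j^ + z k^, found by applying the planar
    Pythagorean theorem twice (assumes both triangles are right triangles)."""
    h = math.sqrt(x**2 + y**2)     # length of the projection onto the XY plane
    return math.sqrt(h**2 + z**2)  # hypotenuse with legs of length h and |z|


r = magnitude_via_projection(3, 2, -7)

# Compare with the one-step three dimensional formula sqrt(x^2 + y^2 + z^2).
print(math.isclose(r, math.sqrt(3**2 + 2**2 + (-7)**2)))  # True
```

Only z squared enters the second step, so the sign of z does not matter, which fits the remark that a length is never negative.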
In fact, the straight line which contains R vector intersects the XY plane at one and only one point, namely the origin of the frame. To summarize: Both R and h have their tails located at the origin, but their heads are located at two different points in space, or perhaps I should say in the frame. So there must be a vector triangle that can be constructed as follows: [math] \vec h + z \hat k = \vec R [/math] Because, now, by simple substitution, we get back the original definition of R... [math] \vec h + z \hat k = \vec R [/math] [math] \vec h = x \hat i + y \hat j [/math] [math] \therefore [/math] [math] \vec R = x \hat i + y \hat j + z \hat k [/math] So now, there is another right triangle formed. I know this is getting a bit long, but it can be simplified later, once we have a formula for the dot product. In this second right triangle, the legs are: [math] \vec h [/math] [math] z \hat k [/math] And the hypotenuse is [math] \vec R [/math] It would be nice to first show that the triangle contains a right angle. Then we know that the Pythagorean theorem holds for it. The direction of any vector which lies on the Z axis of the frame is stipulated to be k^ (or its opposite). Now, the length of [math] z \hat k [/math] is just |z|, and its direction is identical to that of a vector which lies entirely upon the Z axis of the frame. Now, what needs to get done is to show that the angle between vector h and vector zk^ is a right angle. The way I will accomplish this is to resort to using Euclid. First, notice that three non-collinear points lie in exactly one plane. Now, focus on the following three points in the frame: 1. The origin. 2. The head of [math] \vec R [/math] 3. The head of [math] \vec h [/math] Now, focus on the unique plane which contains these three points. Now, here is Euclid's second theorem: Euclid, Book I, proposition 2 Now Euclid's second theorem is actually an ingenious construction. 
In proposition two, he makes immediate use of proposition one, which was just proven rigorously by him. Now here is Heath's translation of the ancient Greek, if I'm remembering it right... "To place a straight line equal to a given finite straight line at a given point as an extremity." But, the meaning of the sentence comes from the construction. You are given an arbitrary point, and you are given an arbitrary straight line. And the given point is not an extremity of the given straight line. And you are asked to construct a straight line, using the given point as an extremity, such that the line you construct has a length which is equal to the length of the line you were given. And Euclid's second proposition shows you how to do this. Now, in Euclid, there is a distinction made between straight line segments, and infinite straight lines. In Euclid Prop2, you are given a finite straight line, and a point not on that line. Now, a finite straight line, and a point not on it, lie in one and only one plane. And this construction takes place in the unique plane which contains the given point, and the given finite straight line. So now back to our problem taking place in a frame. There were three points to currently focus upon, the origin of the frame, the head of h vector, and the head of R vector. Those three points form a triangle, once you join each point to the other. And they are three non-collinear points as well, which means they lie in one and only one plane. Using Euclid's second proposition, where the given finite straight line is [math] z \hat k [/math] and the given point is the origin of the frame, construct a straight line equal to the given straight line, with the origin of the frame as an extremity. You can disregard the direction of the given vector. After you have done that, it is unlikely that the straight line you just constructed coincides with the z axis of the frame. 
You can now utilize Euclid's third postulate, which is this: Euclid, Postulate 3 Translation: You can construct a circle with any point as center, and any radius you desire. So now, using the straight line which was just constructed with the origin of the frame as an extremity, describe a circle, in the same plane the earlier construction was carried out in, such that its center is the origin of the frame, and use the line just constructed as the radius. Now, the circle and the z axis will intersect at exactly two points. One of them will be closer to the head of [math] \vec R [/math] than the other. Call the one which is closer point B, and call the tip of the R vector point C. Now, join point B to point C. This step actually makes use of Euclid's first postulate, which is this: Euclid, Postulate 1 Now, call the origin of the frame point A. The goal is to now show that triangle ABC is congruent to the vector triangle defined by: [math] \vec h + z\hat k = \vec R [/math] Because if you do that, then the corresponding parts of congruent triangles are congruent, and there will be more than enough information available to prove that the vector triangle contains a right angle. Let the frame be a right-handed frame. Now denote the angle between vector AB and vector AC by [math] \phi[/math], so that you are consistent with this. The next goal is to show that angle ABC is a right angle. Call the head of [math] \vec h [/math] point D. Now look at the following two propositions of Euclid: Euclid, Book 1, proposition 29 Euclid, Book 1, Proposition 27 Proposition 27 of Euclid says this: If a transversal cuts two straight lines, and makes alternate interior angles equal to one another, then the two straight lines are parallel. Proposition 29 of Euclid says this: If a transversal cuts two parallel lines, then the alternate interior angles are equal to one another. So proposition 29 is essentially the converse of proposition 27. 
Thus, the two propositions can be combined, using the "if and only if" of binary logic. In other words: Let L1,L2, and L3 be three given straight lines, and let it be the case that L3 intersects L1, and L3 also intersects L2. The alternate interior angles which were formed are equal if an only if L1 and L2 are parallel. Suppose that we can show that vector BC is parallel to vector AD. Then it will follow that angle BAC is equal to angle ACD. And angle BAC has also been called [math] \phi [/math], or if you prefer, the measure of angle BAC is equal to [math] \phi [/math]. Therefore, it would be the case that the measure of angle ACD is equal to phi. Now, Euclid showed that the sum of the interior angles of a triangle is 180 degrees in the following proposition: Euclid, Book 1, Proposition 32 There is a theorem needed, that would make what I am trying to do really really easy. Actually, I'm thinking of a sequence of proofs. There seems to be a concept missing. Something to do with the notion of different vectors which point at the same location in space. Suppose you have infinite straight line L1. Pick two points on it at random, point A, and point B. The distance between them can be measured by a ruler. Now, suppose that we map the real number system onto line L1. Line L1 is now a coordinate axis, call it the X axis of a frame. Let A denote the point 0, of the real number system, and let the coordinate of point B be some arbitrary positive real number. Distance is a strictly nonnegative quantity. The distance between point A, and point B, is the real number which corresponds to the position of point B, minus the real number which corresponds to the position of point A. So let X2 denote the x coordinate of point B, and let X1 denote the x coordinate of point A. The distance between them, is [math] d(A,B) = X2-X1 [/math] I haven't yet made use of a fact I am very well aware of, which is: The vector [math] z \hat k [/math] is parallel to any vector on the Z axis. 
And now, if I use Euclid's propositions 27 and 29, then any transversal which cuts the infinite straight line containing vector [math] z \hat k [/math] and the Z axis will make the alternate interior angles equal. So if the line through AD meets the z axis at a right angle, then angle ADC must also be a right angle. And then I think anything else I want to prove will just follow. I think the problem is arising because I have not paid attention to the exact position of points in the frame. And ultimately, the whole point of this exercise is to discover the formula for the dot product. Ok, what should have been done is to express the distance between two arbitrary points in a plane in terms of their coordinates. So that is just what I am going to do.

Construction of a three dimensional reference frame

I went away and thought about the problem for a bit, and decided the proper way to solve it is to simply first discuss a three dimensional reference frame. Only introduce as many terms as are necessary to fully describe a frame. Suppose you are given two points in space A,B. By Euclid's first postulate, you can construct the straight line from A to B. Now, a straight line is not a vector. A vector is a more powerful concept than that of a straight line, because a vector has both magnitude and direction. Now, you were given two points, but you were allowed to go from any point to any other point, by Euclid's Postulate 1. So there are two possible ways that you could have constructed the straight line. Either you could have drawn from A towards B, or you could have drawn from B towards A. I do not believe that Euclid distinguished between these two. We can afford to be more general than Euclid. 
If you drew from A to B, then you constructed the vector from A to B, which can be written as: [math] \vec{AB} [/math] If you drew from B to A, then you constructed the vector from B to A, which can be written as: [math] \vec{BA} [/math] Now, the distance between any two points in a frame, is a temporal constant, a quantity that never changes for any reason. In other words, in any frame, the distance between two randomly chosen points, is constant in time. So there is no need to ever write that the distance between two points in one frame is a function of the temporal variable t. And this can be made a postulate. Now, I should have said, let the given points be in the same frame. I wasn't initially clear enough. You are given two points A,B in one frame of reference S. Thus, it is impossible for two points in frame S to be in relative motion. Two points in different frames can be in relative motion, but not two points in the same frame. Here is Euclid's first postulate... yes again. Euclid's Elements Postulate One Let's try another source... Here it is in the actual ancient Greek it was written in... Euclid's First Postulate, J.L. Heiberg Pronunciation (i happen to know a tiny bit of ancient Greek) (ee-tee-sthaw) (ah-paw) (pahn-dos) (sih-may-you) (epi) (pan) (see-may-awn) (ef-thay-ee-ahn) (gra-meen) (ah-yga-ygeen) yg is a gutteral sound, not an english 'g' sound, more gutteral, kind of like the 'g' sound, but not exactly. According to the source at Perseus project, the first word of Euclid's sentence could mean: ask for, demand, assume I don't think it means any of these. Here is the modern greek word for assume [math] \upsilon \pi o \theta \epsilon \tau \omega [/math] Pronunciation: (ee-paw-theh-taw) This has clearly become the English word 'hypothesize' Here is a link to a Greek English Dictionary So I don't think Euclid meant "assume" that... he meant something else. Let me skip the first word, and move onto the second... 
Using an online dictionary, the modern greek word [math] \alpha \pi o[/math] can mean 'of', by' or 'from'. Moving on to the third word... Actually, I just found this... The greek word [math] \eta \tau \iota \sigma [/math] means 'any'. With that in mind, have a look at the first word again, only this time using Greek letters: [math] H \tau \iota \sigma \theta o [/math] The first letter of this word, is the capital Greek letter 'Eta', which in lowercase would read... [math] \eta \tau \iota \sigma \theta o [/math] A common practice in ancient civilizations, was to formulate a new word using two old words. In that case, we can analyze this first word of Euclid further... [math] \eta \tau \iota \sigma -\theta o [/math] Now sometimes, the endings of words just varied for grammatical reasons, so I still don't know the meaning of the first word. Quite possibly, the meaning of the first word of this sentence is identical to the meaning of the universal quantifier of first order logic, that is... [math] \forall \equiv \text{for any} [/math] But i don't know yet. Let me move on to the third word of Euclid's sentence... [math] \pi \alpha \nu \tau o \sigma [/math] pronounced (pahn-dos) Now, this word is a still common word in modern greek, and it means 'always'. Moving onto the fourth word, its spelling in Greek is: [math] \sigma \upsilon \mu \epsilon \iota o \upsilon [/math] And here is the modern Greek word for 'point' [math] \sigma \upsilon \mu \epsilon \iota o [/math] I'm not sure if the first one is plural or not, because it's ancient Greek, not modern Greek, but it means either 'point' or 'points' or maybe set of points, not sure yet. Let me move on to the fifth word, its spelling in ancient Greek would be... 
[math] \epsilon \pi \iota [/math] Some possible meanings(source online dictionary): on, upon, over, onto Moving on to the sixth word, its spelling in ancient Greek was: [math] \pi \alpha \nu [/math] Possible meanings(source online dictionary): All, entire, whole Moving on to the seventh word, its spelling in Greek is: [math] \sigma \upsilon \mu \epsilon \iota o \nu [/math] This word probably means 'point' in the singular. Then ending of the word.. 'on' is a masculine ending. Moving on to the eigth word, its spelling in Greek is: [math] \epsilon \upsilon \theta \epsilon \iota \alpha [/math] Now, the modern Greek word for 'straight', from the online dictionary, is this: [math] \epsilon \upsilon \theta \epsilon \upsilon \sigma [/math] They sound practically the same, and are probably synonymous. Moving on to the ninth word, its spelling in Greek is: [math] \gamma \rho \alpha \mu \mu \eta \nu [/math] Using an online dictionary, the modern greek word for 'line' is: [math] \gamma \rho \alpha \mu \mu \eta [/math] The only difference, is that there is no 'n' at the end of this word, i'd call it irrelevent, so they mean the same thing. And now for the tenth and last word; its spelling in Greek is: [math] \alpha \gamma \alpha \gamma \eta \nu [/math] This word isn't in the online dictionary, and the LSJ at Perseus doesn't have it either. Here is the original sentence as close to ancient Greek as I can get it: [math] H \tau \iota \sigma \theta o \ \alpha \pi o \ \pi \alpha \nu \tau o \sigma \ \sigma \upsilon \mu \epsilon \iota o \upsilon \ \epsilon \pi \iota \ \pi \alpha \nu \ \sigma \upsilon \mu \epsilon \iota o \nu \ \epsilon \upsilon \theta \epsilon \iota \alpha \ \gamma \rho \alpha \mu \mu \eta [/math] [math] \ \alpha \gamma \alpha \gamma \eta \nu \text{.} [/math] Now, I will just try a one-to-one translation, and see what happens: Any-tho for always points onto whole point straight line agagein(?). 
That last word appears to be very important, yet I have no idea what it means or meant... I have no idea how Heath arrived at his translation, and he left out a temporal logical word 'always.' I doubt that matters in the meaning of the sentence though. -------------------------------------------------------------------- I put the line here, because everything before it was written yesterday. Last night, I thought about the first word of Euclid's sentence, I am now attempting to translate his very first postulate, from Ancient Greek into modern English. The modern Greek word for 'give' is: [math] \delta \omega \sigma [/math] Pronunciation: {thawz} Here is a link to the LSJ dictionary at Perseus project, which gives the ancient Greek word for the English word 'give'. Look down the list, and you will find the ancient Greek word: [math] \delta \omega \sigma \omega \nu [/math] Pronounced: {thaw-sawn} (It's not exactly the English 'th' sound, its more like the sound you make when you pronounce the English word 'the') I realized that I used the wrong last letter, on the first word of the sentence. It should have been an omega, not an omicron. In other words, the first word of the postulate is: [math] H \tau \iota \sigma \delta \omega [/math] And I know that the word: [math] \eta \tau \iota \sigma [/math] means 'any' in English. And the greek letter 'eta' looks like the roman letter H So, I thought about the possibility that the first word of the postulate is actually composed of two greek words, in other words, I want to carry out a word etymology, but only in Greek. 
So we have this: [math] \eta \tau \iota \sigma+\theta \omega [/math] Now, since the greek word for 'give' is: [math] \delta \omega \sigma [/math] It appears to me, that the very first word, of Euclid's first of five postulates, tranlates into English as follows: [math] H \tau \iota \sigma \theta \omega \equiv \text{Given any} [/math] And even if it should be an omicron and not an omega, it doesn't matter, because I know I have the pronunciation right in both cases. The sounds are the same for sure. And also, the meaning of the word now makes sense in the sentence. So i was right. The first word of Euclid's first sentence translates into English as... The universal quantifier of first order logic. [math] H \tau \iota \sigma \theta \omega \equiv \text{Given any} \equiv \forall [/math] I continued the translation further. The next few words are: [math] \alpha \pi o \ \pi \alpha \nu \tau o \sigma \ \sigma \upsilon \mu \epsilon \iota o \upsilon [/math] the possible translations of 'apo' are: of,by, from Now Heath translated it as, "from", but I think it means 'of'. Heath translated [math] \pi \alpha \nu \tau o \sigma [/math] as 'any', but its the modern Greek word for 'always.' But the word 'always' just doesn't make any sense in the sentence, no matter how you look at it. But, the word [math] \pi \alpha \nu [/math] translates as 'all or 'whole' or entire, and it happens to be the prefix of the word [math] \pi \alpha \nu \tau o \sigma [/math]. So I just decided to translate [math] \pi \alpha \nu \tau o \sigma [/math] as 'all' instead of 'always,' and I got this... [Given any of all points] As a partial translation. It is as if Euclid was using set theory to reason, over two thousand years ago. He postulated a set of points. And then began his first postulate as... Given any (element of the set of) all points... Here is the rest of his sentence... 
[math] \epsilon \pi \iota \ \pi \alpha \nu \ \sigma \upsilon \mu \epsilon \iota o \nu \ \epsilon \upsilon \theta \epsilon \iota \alpha \ \gamma \rho \alpha \mu \mu \eta \ \alpha \gamma \alpha \gamma \eta \nu [/math] Given any of all points, to any point a straight line can be made. (if [math] \sigma \upsilon \mu \epsilon \iota o \nu [/math] is singular) Or possibly Given any of all points, to all points a straight line can be made. (if [math] \sigma \upsilon \mu \epsilon \iota o \nu [/math] is plural) Now, I had to totally guess on the last word, but I figured it means something like 'join' 'connect' 'draw' 'attach' 'construct' 'produce' something like that, because Heath translated the basic meaning correctly. Any way enough of that, back to review of linear algebra. For a reason that will appear clear later, I am now studying triangle similarity. Here are some links I am looking at. Similar triangles 1 Similar triangles 2 Here is a nice one... Similiar triangles 3 Question: Given two triangles, how can you logically analyze them, to inevitably know later on, that they are similiar triangles, or not? First you need a definition of similiar triangles. From link three above, we see that to determine that two triangles are similiar, all we need to know, is that they have two angles which are identical in measure. Actually, two triangles will be similiar if and only if all three of the angles of one are congruent to the angles of the other. The reason for this, is because the sum of the interior angles of any triangle is equal to two right angles (180 degrees), and that was proven by Euclid here: Euclid, Book 1, proposition 32 So in other words, if at least two angles of one triangle correspond to two angles of another, for example ABC = DEF BCA=EFD then CAB=FDE because ABC+BCA+CAB=DEF+EFD+FDE (from Euclid) so using algebra, together with logic, Assume ABC = DEF and that BCA=EFD. This is the scope of the first and only assumption. 
From Euclid: ABC+BCA+CAB=DEF+EFD+FDE Subtract equals from equals to obtain: ABC+BCA+CAB - ABC = DEF+EFD+FDE-DEF BCA+CAB = EFD+FDE Subtract equals from equals to obtain: BCA+CAB -BCA = EFD+FDE-EFD CAB = FDE Now, close the scope of the first assumption, as follows: If ABC = DEF & BCA=EFD then CAB = FDE Which was to be proven. QED So, in other words, if at least two angles of one triangle are congruent to two angles of another triangle, then all three angles of one triangle, are congruent to all three angles of the other. Where we have used a result obtained by Euclid, over two thousand years ago. Ok, here is what wolfram has to say about similarity: Similarity from mathworld Ok, three non-colliner points in space suffice to construct a triangle. Suppose we are given two triangles: [math] \triangle ABC [/math] [math] \triangle EFG [/math] If the location in space, of their centers of inertia are identical, and the triangles are coplanar, it may be possible to make the sides of the triangles overlap by a rotation. If it is not possible to do this after 2pi degrees have passed, in time, then first what needs to be done, is for one of the triangles to be rotated in a third dimension, relative to the plane they are both initially in, and that rotation needs to be, umm, 180 degrees. Then after one of the triangles has been rotated out of the plane it was initally, but its center of inertia remaining at the same place, when it is again in the original plane, it will be possible to rotate it, so that its sides will coincide with the other triangle. Since the triangles are not really made out of material, it's incorrect to refer to them as having a center of inertia, instead it should just be called the center of the triangle. 
But the point is, that there is a condition for the two triangles to become one triangle, after a finite number of translations and rotations have been performed, and that condition is this: Again the given triangles are: [math] \triangle ABC [/math] [math] \triangle EFG [/math] Now, if the two triangles are to even have a chance of perfectly coinciding, then at least one of the angles of one of the triangles must be identical in measure, to at least one of the angles of the other triangle. Suppose we know for certain that angle ABC is equal to angle EFG. Let me do this from scratch. Fact 1: Given any three points in space, exactly one triangle can be constructed. Now, in order to be consistent with modern mathematical language, a triangle is just the straight lines which make up the three sides, and does not include the interior of the triangle. So [math] \triangle ABC \neq \triangle EFG [/math] if and only if there is a point on one, which is not on the other. We can now use naive set theory to say this too. Each side of a triangle, has a certain amount of length, which could be measured by a ruler. You have to bring the ruler up to the side of the triangle to actually make the measurement. We need a way to refer to the sides of a given triangle. There is already a method for doing this, using Latex. We have: [math] \overline {AB} \equiv \text{side AB} [/math] [math] \overline {AC} \equiv \text{side AC} [/math] [math] \overline {BC} \equiv \text{side BC} [/math] So now, any one of the three sides of a triangle is a set of points in space. And now we can use naive set theory to discuss the sides of our triangle. So consider side AB. The straight line segment from point A to point B, consists of the point A, the point B, and all the points in between. Now, to the definition of 'between'. The mathematical concept of 'between' was first seriously focused upon by David Hilbert, and later by Pasch. 
Here is a good link about it Hilbert, Pasch, and Betweeness of points The truth of the matter is, I have David Hilbert's book, and I started to read it, and I didn't like his style of proof. Right now, i don't want to worry about the history of the study of 'betweeness', later yes, now no. Let a straight line have been constructed from point A, to point B. Without looking, I am going to see if i can axiomatize 'betweeness' on my own... A randomly chosen point C in space, will either be between the two given points A,B... or not. Suppose not, what must be true? The answer is rather simple. Let d(A,C) denote the distance from A to C. Let d(B,C) denote the distance from B to C. Let d(A,B) denote the distance from A to B. Now consider the sum of distances, d(A,C)+d(B,C). Let us assume that the sum is an element of the real number system, i.e. [math] d(A,C)+d(B,C) \in \mathbb{R} [/math] And also: [math] d(A,B) \in \mathbb{R} [/math] Now, by the trichotomy property of the real number system, one and only one of the following three statements is true: [math] \text{1.} \ d(A,C)+d(B,C) = d(A,B) [/math] [math] \text{2.} \ d(A,C)+d(B,C) > d(A,B) [/math] [math] \text{3.} \ d(A,C)+d(B,C) < d(A,B) [/math] The shortest path from one point A in space, to another point B in space, is the straight line path of Euclidean Geometry. Now, I have supposed that a randomly chosen point C in space is not between A and B. What that means, is that the point C is not on the shortest path from A to B. But, the union of points comprising the line segment AC, and the line segment CB, is a path from A to B. Since the point C is not on the shortest path from A to B, it therefore is an element of a longer path from A to B. This is what happens when you suppose that the randomly chosen point C, is not an element of the straight line from A to B. 
Therefore: [math] \text{not (C is between A,B)} \Rightarrow d(A,C)+d(B,C) > d(A,B)[/math] And the previous statement is logically equivalent to the contrapositive, and so has the same truth value as its contrapositive. Its contrapositive is: [math] \neg [d(A,C)+d(B,C) > d(A,B)] \Rightarrow \text{C is between A,B} [/math] So now, we need a statement of the form "If C is between A, B then such and such" At which point, betweeness of points can be axiomatized using the distance axioms. Now, if C is between A,B, then it is an element of the straight line segment from A to B, and therefore C would be on the shortest of all possible paths from A to B. Suppose that [math] \neg [d(A,C)+d(B,C) > d(A,B)] [/math] Therefore, by the trichotomy property of the real number system, one of the following statements must be true: [math] \text{1.} \ d(A,C)+d(B,C) = d(A,B) [/math] [math] \text{2.} \ d(A,C)+d(B,C) < d(A,B) [/math] Now, it cannot be statement two which is true, for then C would lie on a path from A to B which is shorter than the shortest of all possible paths. Therefore, statement 1 would be true, under the given supposition. Therefore, the following statement is true: [math] \neg [d(A,C)+d(B,C) > d(A,B)] \Rightarrow \ d(A,C)+d(B,C) = d(A,B) [/math] Where we have closed the scope of the assumption. So now, let us connect the verbal form of the statement that C is between A,B, to a precise mathematical statement of the same fact. We can accomplish this as follows. Stipulate that the following statement is true: [math] \text{C is between A,B} \Rightarrow \neg [d(A,C)+d(B,C) > d(A,B)] [/math] Using hypothetical syllogism, it now must follow, that the following statement is true: [math] \text{C is between A,B} \Rightarrow d(A,C)+d(B,C) = d(A,B) [/math] Now, in this example, we picked one out of infinitely many points in space which could have been chosen at random. 
So, we can use universal generalization and write: [math] \forall C \in \mathbb{R}^3 [\text{C is between A,B} \Rightarrow d(A,C)+d(B,C) = d(A,B)][/math] Translation: For any point C in three dimensional Euclidean space, if C is in between points A and B, then the distance from point A to point C, plus the distance from point B to point C is equal to the distance from point A to point B. So again, let us continue with our randomly chosen point C. In order to have a logical definition of "C is between A,B" we need the converse of [math] \text{C is between A,B} \Rightarrow d(A,C)+d(B,C) = d(A,B)[/math] Now, if A=B, then it is impossible that A>B, for any quantities A,B. Therefore: [math] d(A,C)+d(B,C) = d(A,B) \Rightarrow \neg (d(A,C)+d(B,C) > d(A,B) ) [/math] We can now use hypothetical syllogism, to infer that the following statement is true: [math] d(A,C)+d(B,C) = d(A,B) \Rightarrow \text{C is between A,B} [/math] And this statement is true, for any randomly chosen point C. Therefore: [math] \forall C \in \mathbb{R}^3[d(A,C)+d(B,C) = d(A,B) \Rightarrow \text{C is between A,B}] [/math] So we have now proven that the following two statements are true: [math] \forall C \in \mathbb{R}^3 [\text{C is between A,B} \Rightarrow d(A,C)+d(B,C) = d(A,B)][/math] [math] \forall C \in \mathbb{R}^3[d(A,C)+d(B,C) = d(A,B) \Rightarrow \text{C is between A,B}] [/math] And by saying they are true, we mean that they must be true simultaneously, therefore the following statement must be true: [math] \forall C \in \mathbb{R}^3 [\text{C is between A,B} \Rightarrow d(A,C)+d(B,C) = d(A,B)] [/math] AND [math] \forall C \in \mathbb{R}^3[d(A,C)+d(B,C) = d(A,B) \Rightarrow \text{C is between A,B}] [/math]. So let D be a randomly chosen point in a frame where the straight line defined by the given two points is at rest. 
Therefore, the following statement is true: [math] \text{D is between A,B} \Rightarrow d(A,D)+d(B,D) = d(A,B)[/math] AND [math] d(A,D)+d(B,D) = d(A,B) \Rightarrow \text{D is between A,B}[/math]. Therefore, using the definition of the logical connective "if and only if" the following statement is true: [math] \text{D is between A,B} \Leftrightarrow d(A,D)+d(B,D) = d(A,B)[/math] And it is true for any randomly chosen point, not just D. Therefore: [math] \forall C \in \mathbb{R}^3[\text{C is between A,B} \Leftrightarrow d(A,C)+d(B,C) = d(A,B)][/math] Now, the proof was carried out under the assumption that point A, and point B were different locations in space. If one of the points can move relative to the other, that will complicate the logic. But, if we are to permit ourselves the use of frames which can move relative to one another, then we need to allow for the possibility, that one of the given points can move relative to the other. So we can summarize what we know thus far: [math] \forall A,B \in \mathbb{R}^3: \text{if not(A=B) then} [/math] [math] \forall C \in \mathbb{R}^3[\text{C is between A,B} \Leftrightarrow d(A,C)+d(B,C) = d(A,B)][/math] Now in the general theory of relativity, points in a single frame can move relative to one another, whence some wonder whether or not space can expand. And also, we want to be able to utilize what we learn in the best of all possible manners, so that we might want to allow point A to represent a point permanently at rest in one frame, say frame S, and let point B denote a point permanently at rest in another frame, say S`, and have the two frames be in relative motion, with relative speed v. If we permit this, then the coordinates of point B in frame S are functions of time, and also, the coordinates of point A in frame S` are functions of time. All that was told to us originally, was that the points A,B were given, and we were to axiomatize 'betweeness' without reading the works of David Hilbert and Moritz Pasch. 
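The criterion reached above, that C is between A and B exactly when d(A,C)+d(B,C) = d(A,B), can be tried out numerically. What follows is only a minimal sketch, assuming the points are coordinate triples in R^3 with the ordinary Euclidean distance and that A and B are at rest relative to one another. The names `dist`, `is_between` and the tolerance `tol` are made up for illustration; the tolerance is there only because floating point sums are not exact.

```python
import math

def dist(p, q):
    """Euclidean distance between two points given as (x, y, z) tuples."""
    return math.dist(p, q)

def is_between(a, b, c, tol=1e-9):
    """True when d(a,c) + d(b,c) equals d(a,b), up to the tolerance tol."""
    return abs(dist(a, c) + dist(b, c) - dist(a, b)) <= tol

A = (0.0, 0.0, 0.0)
B = (4.0, 0.0, 0.0)

on_segment = is_between(A, B, (1.0, 0.0, 0.0))   # a point on the segment AB
off_segment = is_between(A, B, (1.0, 1.0, 0.0))  # a point off the line through A and B
beyond_b = is_between(A, B, (5.0, 0.0, 0.0))     # on the line, but past B
endpoint = is_between(A, B, A)                   # the endpoint A itself
```

Note that under this criterion the endpoints A and B themselves count as "between", since d(A,A) + d(B,A) = d(A,B) once d(A,A) is taken to be zero.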
For right now, only consider the case where A and B are permanently at rest relative to one another. Now, either A=B or not (A=B). Suppose that A=B. Therefore: [math] \forall C \in \mathbb{R}^3[\text{C is between A,A} \Leftrightarrow d(A,C)+d(A,C) = d(A,A)][/math] Now, we have to decide whether or not d(x,x)=0 for any x should be adopted as a distance axiom. Here are the standard distance axioms, of a Euclidean metric: Distance Axioms The only problem I have with permitting the distance from a point to itself to be equal to zero, is that it conflicts with the operational definition of a ruler. A ruler is something which must have a length greater than zero. So if you hold that distance is what rulers measure, then there is a conflict. This can be made more analytical, using the special theory of relativity, in combination with first order logic, as follows: Suppose we wish to verify whether or not the Lorentz contraction formula is true. First, let us postulate a set of rulers. We have already postulated a set of points. Suppose we have two rulers, which when at rest relative to one another have the same length, see below: A--------------------B A`-------------------B` Let the frame which ruler AB is permanently at rest in, be called frame S. Let the frame which ruler A`B` is permanently at rest in, be called frame S`. Let the origin of S be the center of inertia of ruler AB, and let the origin of frame S` be the center of inertia of ruler A`B`. Now, according to the theory of special relativity, these two rulers should always have the same length when at rest with respect to each other, disregarding tiny perturbations in length, due to the nonzero temperature of the rulers. Let us assume the following formula is a true statement in frame S. [math] L = L_0 \sqrt{1-v^2/c^2} [/math] Stipulate that frame S is an inertial reference frame. 
Thus, ruler AB will remain at rest in other inertial reference frames, or in straight line motion at a constant speed, until such time as an external force acts upon ruler AB. Because the ruler is permanently at rest in frame S, if an external force acts to accelerate ruler AB, frame S will accelerate along with the ruler, however, at that moment in time, the truth value of the statement "S is an inertial reference frame" will switch from true to false. For the duration of the 'event' to be analyzed here, let it be stipulated that frame S is an inertial reference frame. Thus, all of Newton's laws of motion are true statements in frame S. Disregarding how S` got into motion relative to S, let it be the case that S` is moving relative to S at relative speed v. For the sake of definiteness, let the X axis of frame S be parallel to ruler AB, and let the X` axis of frame S` be parallel to ruler A`B`, and let the positive x axis of both rulers point from left to right. ---------------------------------------------------------------- Let me make a long story short... I'm not sure that we should define distance from a point to itself as zero, i think it might cause contradiction.
Johnny5 Posted April 12, 2005 Author Posted April 12, 2005 "At some point, you're going to have to start defining things in terms of other things. In my opinion, relations are rather a nice, neat and concise way of doing this and follow pretty much from set theory." Ok, I just went back and read this more carefully. Here, you allude to the fact that you can define 'relation' using basic concepts of set theory. I would like to see how this is done. Thank you
Johnny5 Posted April 14, 2005 Author Posted April 14, 2005 Yesterday, I started thinking about the distance axioms, so that's where I will pick things up today, nice and fresh. Here are the distance axioms: [math] \text{Axiom I: }\forall A \in \mathbb{R}^3[ d(A,A) = 0] [/math] [math] \text{Axiom II: }\forall A \in \mathbb{R}^3 \ \forall B \in \mathbb{R}^3 [\neg(A=B) \Rightarrow d(A,B) > 0] [/math] [math] \text{Axiom III: }\forall A \in \mathbb{R}^3 \ \forall B \in \mathbb{R}^3 [d(A,B) = d(B,A)] [/math] [math] \text{Axiom IV: }\forall A \in \mathbb{R}^3 \ \forall B \in \mathbb{R}^3 \ \forall C \in \mathbb{R}^3 [d(A,B) + d(B,C) \geq d(A,C)] [/math] Axiom four is called the triangle inequality. Its logical formulation comes to you by merely looking at a triangle in an affine plane. An affine plane is a fancy way to say that the triangle isn't in a coordinate plane; there are no coordinates assigned to axes. The symbol [math] \mathbb{R}^3 [/math] stands for Euclidean three dimensional space. An element of R^3 is called a point. So, for example, to translate axiom 1 into English, we have to say, "For any point A in Euclidean space, the distance from the point to itself is equal to zero." The notation is actually a compact form of the following more complicated expression: [math] \text{Axiom I: }\forall X[X \in \mathbb{R}^3 \Rightarrow d(X,X) = 0] [/math] Translation: For any symbol X, if X is an element of Euclidean space, then the distance from X to X is equal to zero. If you additionally know that elements of Euclidean space are called points, then you can translate it this way: Translation: For any symbol X, if X denotes a point, then the distance from X to X is equal to zero. Now, somewhere in the universe is the center of mass of the universe, and this location is at rest in at least one inertial reference frame. Let us choose one out of the many, and call it S. Now, let S` denote a frame which is in motion through S. What I want to do is make sure that axiom one is ok. 
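Before worrying about rulers, the four axioms can at least be spot-checked for the ordinary coordinate distance. This is a minimal sketch, not a proof: it assumes points are (x, y, z) tuples, it takes the triangle inequality in its three-point form d(A,B) + d(B,C) >= d(A,C), and it only tests a small random sample. The names `d`, `check_axioms`, `sample` and the tolerance `tol` are made up for illustration.

```python
import itertools
import math
import random

def d(p, q):
    """Euclidean distance between two points given as (x, y, z) tuples."""
    return math.dist(p, q)

def check_axioms(points, tol=1e-9):
    """Return True if the four distance axioms hold on every combination of the sample."""
    # Axiom I: the distance from a point to itself is zero.
    for a in points:
        if d(a, a) != 0.0:
            return False
    for a, b in itertools.product(points, repeat=2):
        # Axiom II: distinct points are a positive distance apart.
        if a != b and not d(a, b) > 0.0:
            return False
        # Axiom III: distance is symmetric.
        if abs(d(a, b) - d(b, a)) > tol:
            return False
    # Axiom IV: triangle inequality, with a small tolerance for rounding.
    for a, b, c in itertools.product(points, repeat=3):
        if d(a, b) + d(b, c) < d(a, c) - tol:
            return False
    return True

random.seed(0)  # fixed seed so the sample is reproducible
sample = [tuple(random.uniform(-10.0, 10.0) for _ in range(3)) for _ in range(12)]
axioms_hold = check_axioms(sample)
```

A check like this says nothing about whether a distance of zero is physically measurable; it only shows that the coordinate formula is consistent with the axioms as written.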
Pick two points X,Y, at random in frame S. It is impossible for these two points to be moving relative to one another. The distance between them is denoted by: [math] d(X,Y) [/math] Now here is the thought which is bothering me. Suppose we stipulate that distance is something which must be measured by a ruler. No ruler can have a length of zero, and so a distance of zero makes no sense. Or, to put this another way, if a line can shrink to a point, then there would be a contradiction of your domains of discourse. In other words, you would have something like this going on... No element of the set of lines is an element of the set of points, and no element of the set of points is an element of the set of lines. And if you entertain as a possibility that the following statement is true: [math] L = L_0 \sqrt{1-v^2/c^2} \wedge \exists S [ v=c] [/math] You can reach an explicit contradiction. The first conjunct of the compound statement above is just the Lorentz contraction formula of special relativity theory. Suppose that you have that as true, and that you entertain as possible that there is at least one inertial reference frame S in which the speed of the center of inertia of something is v, and also v=c in frame S. Then what happens is this: Using the formula for length as a function of speed, you will see that in frame S, the object which is moving is a point, and not a straight line. And no point is a line, yet the object is a line in other frames, in which it has its proper length. So that the following contradiction is explicit: [math] A \in \mathbb{R}^3 \wedge \neg (A \in \mathbb{R}^3 ) [/math] From which you conclude the following: [math] L = L_0 \sqrt{1-v^2/c^2} \Rightarrow \neg \exists S [ v=c] [/math] But of course if SR is false, you don't have to worry about the previous argument. Moving on... 
The key thing which defines a straight line, is that it is the shortest of all possible paths from one point X, to another point Y, in a single frame. So now, to any two points in a single frame, there is exactly one real number chosen which corresponds to the distance between the two points. The concept of shortest path is indispensable in physics, and I want to make sure there is no contradiction introduced into the logic of the distance axioms, when they interact with "shortest path" axioms which we may choose to introduce alongside the distance axioms. Distance is applied to "path length" Suppose something is moving in an inertial reference frame S, with constant speed v, and path: straight line. By definition, the speed of the object in this frame is defined to be distance travelled, divided by time of travel. [math] v_s = \frac{D}{\Delta t} [/math]
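The four distance axioms listed earlier in this post can be spot-checked numerically for the ordinary Euclidean distance on R^3. This is only a minimal sketch: the function names `d` and `check_distance_axioms`, the random sample, and the tolerance `tol` are all assumptions made for illustration, and a check on finitely many points is evidence, not a proof.

```python
import itertools
import math
import random

def d(a, b):
    """Euclidean distance between two points of R^3 given as 3-tuples."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def check_distance_axioms(points, tol=1e-12):
    """Return True if d satisfies Axioms I-IV on every combination of the sample points."""
    for a in points:
        if d(a, a) != 0:                        # Axiom I: d(A,A) = 0
            return False
    for a, b in itertools.product(points, repeat=2):
        if a != b and not d(a, b) > 0:          # Axiom II: distinct points, positive distance
            return False
        if d(a, b) != d(b, a):                  # Axiom III: symmetry
            return False
    for a, b, c in itertools.product(points, repeat=3):
        if d(a, b) + d(b, c) < d(a, c) - tol:   # Axiom IV: triangle inequality (up to rounding)
            return False
    return True

# Hypothetical sample: 12 random points in a cube of side 20.
random.seed(0)
sample = [tuple(random.uniform(-10, 10) for _ in range(3)) for _ in range(12)]
print(check_distance_axioms(sample))
```

Note that Axiom I holds here simply because every coordinate difference is zero; no notion of "path" or "ruler" enters the computation.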
matt grime Posted April 14, 2005 Posted April 14, 2005 And now you're doing metric spaces? Almost no vector spaces have a metric. You can define, rigorously, relations and then functions using set theory if you want to, but why bother? It doesn't aid you in understanding what functions do, or proving things about functions. Define it in words.
Johnny5 Posted April 14, 2005 Author Posted April 14, 2005 You have to do things using both words, and math/logic symbols. Words are for communication to those who speak your language, but the symbols allow you to communicate to others who do not. (Also, and most importantly, logical symbols allow you to reason correctly.) Right now I'm addressing an issue which has bothered me since the very first time I saw it, which was well over ten years ago. Why was the distance from a point to itself defined at all? There are an infinite number of paths from a point to itself, and no one of them is shorter than all the rest. It is only in the case of two distinct points that "shortest path" has any meaning. It's a minor point perhaps, but once I solve it to my satisfaction, I can move on to something else, with a better approach. I guess I am just trying to decide whether I am going to adopt axiom 1 or not, above. This one: [math] \forall A \in \mathbb{R}^3 [d(A,A) = 0] [/math] The others are fine. I have already accepted the truth of the triangle inequality, and distance between two points has nothing to do with direction, it's positive, and hence I have no problem with d(A,B)=d(B,A)>0 for any points A,B such that not(A=B). If there is no reason not to adopt it, then I will, but I don't want it to screw anything else up, and it might, since frames can move relative to one another. I just want to check so that I can be sure. Perhaps it will help you understand more if I explain to you what I am trying to do. Take a given Euclidean frame of reference. It has a Euclidean metric, given by the Pythagorean theorem. Let me explain this. All I mean by Euclidean frame of reference is really two things... First, that there are three mutually perpendicular coordinate axes. That constitutes a frame. Second, the distance between any two points in a single frame S is given by the generalized Pythagorean theorem. So, I approach the problem, as always, using first order binary logic. 
I ask myself what my domains of discourse are. In the case that there is only one domain of discourse, things are rather easy. In the case of multiple domains of discourse, I know the symbolism is going to increase in complexity, but necessarily so. Now, there are already firmly established symbols in place for me to use. The most commonly used symbols are: [math] \forall \equiv \text{For any}[/math] [math] \exists \equiv \text{There is at least one} [/math] [math] \neg \equiv \text{not} [/math] [math] \wedge \equiv \text{and} [/math] [math] \vee \equiv \text{or} [/math] [math] \Rightarrow \equiv \text{if-then} [/math] [math] \Leftrightarrow \equiv \text{if and only if} [/math] Now, there is some redundancy in the use of all of them, but that is irrelevant. They are all studied, so all can be known. Now, the logical operators above have precise logical definitions, in terms of the two truth values true, false, and their usage must be in accordance with those definitions. But those symbols are defined on one and only one set, which I will denote as [math] \mathbb{S} [/math] So when you teach logic, one of your domains of discourse is [math] \mathbb{S} [/math] An element of S is usually called a statement, but also sometimes called a proposition. I pretty much stick with the term statement. The criterion for X to be an element of [math] \mathbb{S} [/math] is that it be either true or false, at any moment in time. And this idea can be formulated, using set-builder notation, as follows: [math] \mathbb{S} \equiv \{ X: |X|=0 \ \text{XOR} \ |X|=1 \} [/math] Translation: [math] \mathbb{S} [/math] is the set of all X, and only X, such that the truth value of X is false (XOR) the truth value of X is true. The meaning of XOR comes from its truth functional definition:
A B | A XOR B
0 0 | 0
0 1 | 1
1 0 | 1
1 1 | 0
Thus, the statement denoted by (A XOR B) is true when A is true, or B is true, but not both. 
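The truth-functional definition of XOR above can be reproduced mechanically. A minimal sketch, assuming 1 stands for true and 0 for false; the helper names `xor` and `truth_table` are made up for this illustration.

```python
from itertools import product

def xor(a: bool, b: bool) -> bool:
    """Exclusive or: true when a is true or b is true, but not both."""
    return (a or b) and not (a and b)

def truth_table():
    """Rows (A, B, A XOR B) in the order 00, 01, 10, 11."""
    return [(int(a), int(b), int(xor(a, b)))
            for a, b in product([False, True], repeat=2)]

for row in truth_table():
    print(*row)
```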
Now I am accustomed to using temporal logic, which means that the truth value of some statements can vary in time. This actually matters in what I am doing, because there are frames which can go from being inertial frames to non-inertial frames, in an ultimate reality sense. I don't really know how else to say it. In other words, some frame S can go from being an inertial reference frame to a non-inertial reference frame, in what I'm working on. So I have to deal with statements whose truth value can vary in time. It's harder than it sounds. It's not hard at all, really. In fact it makes more sense than thinking only about statements whose truth value is constant in time. At any rate, astronomers believe that space is expanding. So whether they know it or not, they need to analyze that idea using temporal logic, since the distance between two points in one frame is a function of time. Contrast that with the Euclidean space with which everyone is familiar. In Euclidean space, the distance between two points in one frame cannot vary in time. So if I say the distance between point X and point Z in frame S is 3 meters, that statement is true forever. Contrast that with the idea that space can stretch. Using that idea, the distance between X and Z is not constant in time, so that the statement can go from being true in frame S to being false in frame S. The only way to succeed at logical analysis, using this kind of complexity, is to know binary logic thoroughly. So while it may seem like I am stuck on one stupidly simple statement, which happens to be: [math] \text{for any reference frame S, and any point A in S} [d(A,A) = 0] [/math] I can't just casually mark it off as true forever, without first checking out how it impacts the massive logical system which I've been steadily working on. If you didn't understand, it doesn't matter; eventually I will figure out what I want to know, and then that will be the end of it. 
Regards PS: I can be more precise. Suppose that some ruler is in its own rest frame S, permanently, no matter whether it is at rest in some other frame S`, or moving at a constant speed in S`, or accelerating in S`. Then if the theory of special relativity is correct, its length is a frame dependent quantity. Here is the formula for its length: [math] L = L_0 \sqrt{1-v^2/c^2} [/math] In its own rest frame, its speed is 0. So using that formula, its length in its own rest frame is: [math] L = L_0 [/math] Now, a ruler is that which can measure distance. Suppose that the center of inertia of this ruler is moving through the coordinates of inertial reference frame S` at speed c. Then using the formula above, the length of the ruler in this frame is given by: [math] L = L_0 \sqrt{1-c^2/c^2} = 0[/math] And this would certainly cause some kind of contradiction, which could be made explicit if I wanted. Mathematicians think nothing of setting the distance from a point to itself equal to zero. But if distance is operationally defined, then they cannot do this. The only way I would easily agree to it is if they proved that it reduced proof complexity. What's bugging me is strings. Strings are things with length. Now, no matter how much you twist a string, its length is pretty much constant. But now focus on the ends of the string. If I ask what the distance is between them, you can take a ruler and measure it, no thought necessary. You don't need the Pythagorean theorem, you just need a ruler. But the key to a straight line is that of all the strings that could go from one point to another, the straight line is the shortest. The other lines have a 'distance', just not a 'straight' distance. But the Euclidean metric is the one which represents a minimization of "all possible string lengths from one point to another different point." So if distance is associated with length, then allowing distances of zero might lead to a contradiction with something else.
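To see how the contraction formula quoted above behaves as v approaches c, it can simply be evaluated at a few speeds. A minimal sketch: the function name `contracted_length` and the chosen fractions of c are assumptions for illustration, and the code only evaluates the formula; it takes no position on whether v = c is physically attainable for a ruler.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def contracted_length(proper_length, v, c=C):
    """Evaluate L = L0 * sqrt(1 - v^2/c^2) for a speed 0 <= v <= c."""
    if not 0 <= v <= c:
        raise ValueError("speed must satisfy 0 <= v <= c")
    return proper_length * math.sqrt(1.0 - (v / c) ** 2)

# A 1 m ruler (proper length L0 = 1) at several fractions of c.
for fraction in (0.0, 0.5, 0.9, 0.99, 1.0):
    print(fraction, contracted_length(1.0, fraction * C))
```

The value at v = 0 is the proper length, and the value at v = c is exactly zero, which is the case the post is worried about.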
Johnny5 Posted April 14, 2005 Author Posted April 14, 2005 In this part here I'm gonna run through the vector space axioms. Actually, I think I'm gonna turn them into theorems. Let [math] \vec u [/math] denote an arbitrary vector. Let [math] \vec v [/math] denote an arbitrary vector. Definition: Vector addition is defined geometrically. Theorem: [math] \vec u + \vec v = \vec v + \vec u [/math] Let S denote an arbitrary three dimensional Euclidean frame. Therefore, the vectors in the theorem can be expressed using their coordinates in the frame as follows: [math] \vec u = u_1 \hat e_1 + u_2 \hat e_2 + u_3 \hat e_3 [/math] [math] \vec v = v_1 \hat e_1 + v_2 \hat e_2 + v_3 \hat e_3 [/math] Or in a more compactified notation: [math] \vec u = u_1 \hat e_1 + u_2 \hat e_2 + u_3 \hat e_3 = \sum_{n=1}^{n=3} u_n \hat e_n [/math] [math] \vec v = v_1 \hat e_1 + v_2 \hat e_2 + v_3 \hat e_3 = \sum_{n=1}^{n=3} v_n \hat e_n [/math] Or, using Einstein summation convention, we can write: [math] \vec u = \sum_{n=1}^{n=3} u_n \hat e_n = u^n \hat e_n[/math] [math] \vec v = \sum_{n=1}^{n=3} v_n \hat e_n = v^n \hat e_n[/math]
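The post breaks off before the theorem is proved. Once the vectors are written in coordinates as above, one way the argument could go is component by component, where commutativity reduces to commutativity of real-number addition. The snippet below is only an illustration of that idea on one pair of made-up vectors, assuming vector addition in frame S is the component-wise sum; the name `add` is hypothetical.

```python
def add(u, v):
    """Component-wise sum of two vectors given by their coordinates in frame S."""
    return tuple(un + vn for un, vn in zip(u, v))

u = (1, -2, 3)   # (u_1, u_2, u_3)
v = (4, 0, -5)   # (v_1, v_2, v_3)
print(add(u, v), add(v, u))
```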
matt grime Posted April 15, 2005 Posted April 15, 2005 Vector addition isn't *defined* geometrically, and it is part of the axioms that the vector space is commutative under addition, so if it is a theorem it is a trivial one. Moreover, it is an axiom for a metric space that d(x,y)=0 if and only if x=y. And it is clear from the Euclidean metric on R^n that this is true. It has nothing to do with lengths of string, or twisting them. If d(x,y) = 0 for some x not equal to y, or d(x,x) isn't zero, it isn't a metric, that is all. As it happens, the universe is more properly modelled with hyperbolic geometry. But note the emphasis here on the word modelled. You apparently seem to think the universe IS Euclidean space. If you don't think it is a suitable model, that is different. Nor do I see why you think there are an infinite number of paths from a point to itself, no one of which is any shorter than any other, and why this isn't true of two other points. The number of paths isn't important anyway, only the inf of all the lengths. There are an infinite number of paths, all of the same shortest length, between antipodal points on the sphere. What exactly do you mean by path anyway? Do you in any book see anything that specifies uniqueness of paths? Oh, and you don't need to write out the (usual) definitions for me, I already know them. When I say "what do you think a path is", for example, I am pointing out that you don't appear to be using standard fixed ideas.
Johnny5 Posted April 15, 2005 Author Posted April 15, 2005 [quote]Vector addition isn't *defined* geometrically, and it is part of the axioms that the vector space is commutative under addition, so if it is a theorem it is a trivial one.[/quote] Newton used a parallelogram law for vector addition in the Principia. You say that 'addition' of two vectors is an axiom, not a definition. I don't see that. If you walk 10 meters north, then 10 meters due east, you can make a vector triangle, and talk about the resultant vector, which is the hypotenuse of a right triangle, both of whose legs are 10 meters long. But it is a vector, not a scalar, and its direction is from the starting point of your journey to the ending point, along some... path. The length of the hypotenuse, by the Pythagorean theorem, is: [math] h = \sqrt{10^2+10^2} = \sqrt{100+100} = \sqrt{200}[/math] [math] 13*13=(13)^2 = 169 [/math] [math] 14*14=(14)^2 = 196 [/math] [math] 15*15=(15)^2 = 225 [/math] Hence: [math] 14= \sqrt{196} < \sqrt{200}<\sqrt{225}=15[/math] Let us use 14.1 as a first approximation to the square root of 200. [math] 14.1*14.1=(14.1)^2 = 198.81 [/math] Seeing that this is less than 200, we need to guess higher. Let us guess 14.2 instead. [math] 14.2*14.2=(14.2)^2 = 201.64 [/math] Thus, we have guessed too high, when before we had guessed too low. But at least we know that the value of the square root of 200 is trapped between 14.1 and 14.2. We can write this as follows: [math] 14.1 < \sqrt{200}<14.2 [/math] Now, there is a very very ancient way to figure out the square root of a number, called the Babylonian formula.
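The post stops before stating the formula. The iteration usually known as the Babylonian (or Heron's) method replaces a guess x by the average of x and S/x. A minimal sketch, assuming that is the formula meant; the function name `babylonian_sqrt`, the tolerance, and the iteration cap are choices made for this illustration.

```python
def babylonian_sqrt(s, guess, tol=1e-12, max_iter=50):
    """Approximate sqrt(s) for s > 0 by repeatedly averaging the guess with s/guess."""
    x = float(guess)
    for _ in range(max_iter):
        nxt = 0.5 * (x + s / x)    # average of the current guess and s / guess
        if abs(nxt - x) <= tol:    # stop once successive guesses agree
            return nxt
        x = nxt
    return x

# Start from the first approximation used above, 14.1.
print(babylonian_sqrt(200, 14.1))
```

Starting from 14.1, the result lands between 14.1 and 14.2, consistent with the bracketing found by trial squaring.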
matt grime Posted April 15, 2005 Posted April 15, 2005 Hmm, if you want to do maths as if you're a physicist who's been dead for the best part of 300 years, good luck to you. Vectors are not, in modern maths, "things that have length and direction"; they are elements of a vector space, things that satisfy certain axioms. A model of which is useful for describing what are commonly called "vectors", such as force and displacement.
matt grime Posted April 15, 2005 Posted April 15, 2005 I just posted a reply pointing out that doing vector spaces according to someone who's been dead for 280 years isn't a good idea, only it didn't appear. Plus your triangle walking experiment doesn't account for the fact that the earth is curved... learn to differentiate between a model and the reality it is modelling
Johnny5 Posted April 16, 2005 Author Posted April 16, 2005 [quote]I just posted a reply pointing out that doing vector spaces according to someone who's been dead for 280 years isn't a good idea, only it didn't appear. Plus your triangle walking experiment doesn't account for the fact that the earth is curved... learn to differentiate between a model and the reality it is modelling[/quote] What are you talking about, Matt? Are you talking about Riemannian geometry? Regards
matt grime Posted April 17, 2005 Posted April 17, 2005 Erm, is the earth's surface a) Euclidean, b) Riemannian (hyperbolic), c) spherical, or d) none of the above necessarily, though Euclidean geometry provides a reasonably good approximation/model for local questions, and spherical for global questions? What about the observation that thinking of modern vector spaces (kernels, linear maps, spectral theory), set theory (a function defined as a subset of AxB, domains, codomains), and propositional logic in terms of Newton, who died many years before any of those systems were created, formalized, studied and presented in the manner of the text books you are reading, is probably not going to be a great help? A vector is an element of a vector space, V. A linear map is a function from V to V (or some other vector space) satisfying certain properties. A function is a way of associating to each element in the domain a unique element in the codomain. The propositional form for this uniqueness is, if you'll forgive the lack of formality, something like: for all x in the domain, there is a y in the codomain such that f(x)=y [or whatever form you prefer your functions to take], and such that for all z in the codomain f(x)=z implies z=y.
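For finite sets, the "function as a subset of AxB" condition described above can be tested directly. A minimal sketch: the name `is_function` and the small example sets are assumptions made for illustration.

```python
def is_function(f, A, B):
    """True if the set of ordered pairs f assigns to each x in A exactly one y in B."""
    if any(x not in A or y not in B for x, y in f):
        return False  # f must be a subset of A x B
    # For every x in A, the set of y with (x, y) in f must have exactly one element.
    return all(len({y for (x2, y) in f if x2 == x}) == 1 for x in A)

A = {1, 2, 3}
B = {"a", "b"}
print(is_function({(1, "a"), (2, "a"), (3, "b")}, A, B))            # each x has one y
print(is_function({(1, "a"), (1, "b"), (2, "a"), (3, "b")}, A, B))  # 1 has two images
print(is_function({(1, "a"), (2, "a")}, A, B))                      # 3 has no image
```

The first example passes even though 1 and 2 share the image "a": uniqueness of the image of each x does not require the function to be injective or bijective.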
Johnny5 Posted April 17, 2005 Author Posted April 17, 2005 [quote]for all x in the domain, there is a y in the codomain such that f(x)=y [or whatever form you prefer your functions to take], and such that for all z in the codomain f(x)=z implies z=y.[/quote] Just want to think about this a bit. Let A denote the domain. Let B denote the codomain. [math] \forall x \in A \exists y \in B [ y=f(x) \ \& \ \forall z \in B (f(x)=z \Rightarrow z=y) ] [/math] Using the uniqueness notation we have this: [math] \forall x \in A \exists ! y \in B [ y=f(x)] [/math] That is so much easier to read. Translation: Given any x an element of the domain A, there is one and only one element y in the codomain B such that y is equal to f(x). I prefer using the notation for ordered pairs actually, Matt, because then there is some reference made to the fact that f is a set. Oh by the way, the surface of the earth isn't any exact mathematical structure. It can be modelled as an oblate spheroid well enough though. So the surface isn't Riemannian, or Euclidean, in any precisely expressible manner. So my answer is none of the above, not necessarily. Space isn't a material object, so space is Euclidean. Local, nonlocal, it's all Euclidean. The curvature of a vacuum is literally zero. I wouldn't mind talking about vector spaces with you though. Maybe it could help. So a vector [math] \vec v [/math] is an element of a vector space [math] \mathbb{V} [/math]. Real space is Euclidean three dimensional space, which is normally represented as [math] \mathbb{R}^3 [/math] So let an arbitrary vector [math] \vec v [/math] be an element of [math] \mathbb{R}^3 [/math]. In order to discuss linear mapping, we have to first specify which vector space is the domain, and which vector space is the codomain. How about we start off with Euclidean three space as the domain, and Euclidean three space as the codomain, and talk about linear maps from R^3 to R^3, for starters. 
So give me a simple example of a linear mapping from R^3 to R^3.
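One possible example, offered only as a sketch and not as anyone's reply in the thread: the map (x, y, z) -> (2x, x + y, 3z) from R^3 to R^3. The map itself, the helper names `add` and `scale`, and the test vectors are all made up for illustration. The code checks the linearity condition L(alpha u + beta w) = alpha L(u) + beta L(w) on one choice of vectors and scalars, which illustrates the condition but does not prove it for all inputs.

```python
def L(v):
    """A hypothetical example map R^3 -> R^3: (x, y, z) -> (2x, x + y, 3z)."""
    x, y, z = v
    return (2 * x, x + y, 3 * z)

def add(u, v):
    """Component-wise vector sum."""
    return tuple(a + b for a, b in zip(u, v))

def scale(alpha, v):
    """Scalar multiple of a vector."""
    return tuple(alpha * a for a in v)

u, w = (1, 2, 3), (-4, 0, 5)
alpha, beta = 2, -3

lhs = L(add(scale(alpha, u), scale(beta, w)))     # L(alpha u + beta w)
rhs = add(scale(alpha, L(u)), scale(beta, L(w)))  # alpha L(u) + beta L(w)
print(lhs, rhs, lhs == rhs)
```

Every component of this map is a sum of constant multiples of the coordinates, which is why the condition holds; a map such as (x, y, z) -> (x + 1, y, z) would fail the same check.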