My very short answer: A value of a random variable.
Let's say you receive a symbol "1". If this is the only possible symbol, the fact that you received it gives you no information. But if this symbol is one of two possibilities, "0" and "1", then the reception of "1" may carry information. So having more than one symbol is necessary, but not sufficient. Say you receive the pattern "111111...": the probability of the symbol "1" is 1, and again there is no information. But if random sequences are allowed, for example "00", "01", "10", "11", then we can use these sequences to represent information. So conceptually, information can be seen as the value of a random variable.
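To put a number on this intuition, Shannon measures the information (in bits) carried by a symbol $x$ with probability $P(x)$ as its self-information

$$I(x) = -\log_2 P(x).$$

For the single-symbol case above, $P(\text{"1"}) = 1$ gives $I = -\log_2 1 = 0$ bits, matching the claim that a certain symbol carries no information. For two equally likely symbols, $P = \tfrac{1}{2}$ gives $I = -\log_2 \tfrac{1}{2} = 1$ bit per symbol, and the average over all symbols of a source is its entropy, $H = -\sum_i p_i \log_2 p_i$.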
The above is an attempt at an extremely short introduction to information theory, which is tied to discrete probability theory. The most important early contributor was Claude E. Shannon, whose paper “A Mathematical Theory of Communication” deals quantitatively with the concept of “information”. Shannon's concepts, and the mathematics he used to describe and measure information content, are a remarkable contribution. I believe it's hard to find any area of IT that his work does not touch.
Wikipedia* has links to some concepts related to your question. Feel free to ask additional questions.
*) https://en.wikipedia.org/wiki/A_Mathematical_Theory_of_Communication
For an early predecessor of Shannon, who worked on sinusoidal signals and frequencies, see Hartley: https://en.wikipedia.org/wiki/Ralph_Hartley