How to connect Astrology with Predictive Models, Variables, Data Types and Data Science

Statistical Constructs

The first part of this series explains the purpose and need for statistical perspectives on astrology. If you haven’t read it, I recommend you do so first to understand the context of this second part. This is a rather long article, so please set aside some time and read it patiently.

A Question for You

I think it’s fitting to start the second part of this article series with an example. Let’s say you’re applying for a job and you’re at a job interview. You’re given a measuring tape and are asked to find the approximate weight of a few middle-aged men and women of average height and build who are sitting in front of you.

You’re allowed to ask the interviewer more questions or request additional information or data if needed. What would you do? What questions would you ask, and what data or information would you request, to quickly answer the question you’ve been given?

Pay close attention to the question. You’re given a measuring tape, which helps measure the distance between two points. But you’re being asked to estimate the weight of some people. Take a pause here, think about your answer or approach, and jot it down. By the time you finish reading the rest of this article, you may find multiple paths to answer the question I’ve posed.

You can relate this example to astrology. In astrology, you look at the planetary position data, but the predictions you give are about life events, aren’t they?

What We’ll Cover in This Part

In this article, we’ll look at some basic statistical concepts. The goal of this part is to introduce you to what variables are in statistics and how equations are derived.

This article might seem challenging for those who have no basic familiarity with statistics and a bit superficial for those who have studied statistics. This article is written with the former audience in mind. I am deliberately keeping it a bit superficial, as I realize that deep explanations of statistical concepts might alienate this type of reader.

This article is an important foundation for understanding the later parts of this series. I have tried to condense a vast amount of statistical knowledge—which can take years to study and understand—into one article. I am writing this with the awareness that if you are not interested, there is a good chance you might not understand it.

This part is a crucial introduction to the types of statistical constructs found in astrology. Therefore, I request that readers take some time to reread and understand any parts that they find difficult.

This article is closely related to my previous article series on how astrological rules are constructed. For your convenience, I have provided the links to that series below. You can read them when you have time to get more in-depth information.

Statistical Mathematical Models – An Introduction

We know that equations mathematically summarize the relationship between two or more things. These equations can be exact or approximate.

Y = f(Xi)

Where i=1 to n

This is a general notation used to indicate a relationship between two things. Here, Y is the outcome, and X represents its factors or variables. The notation i=1 to n indicates that there could be more than one factor.

You may have studied exact relationships in fields like physics and chemistry, which deal with inanimate objects. For example, you know the equation for water.

Water = f(Hydrogen, Oxygen); n = 2

H2O = f(H, O)

2 H2 + 1 O2 = 2 H2O

Two molecules of hydrogen (H2) combine with one molecule of oxygen (O2) to form two molecules of water, so each water molecule contains 2 atoms of hydrogen and 1 atom of oxygen. This is a fixed, definite equation. What we need to understand is the 2:1 ratio. In such equations, a certain outcome is determined by one or two inanimate factors, which makes certainty about the outcome possible.

However, in equations dealing with living things, exact equations rarely work. Because an action can have more than one factor, and not all factors can be identified, classified, and measured, mathematical equations about living things are always approximate to some extent and carry a certain amount of error. The factors that control a person’s blood pressure are one example.

Furthermore, a fundamental property of all living things is a certain degree of randomness: they don’t always react in exactly the same way to the same stimulus, even when all else is equal (ceteris paribus).

This is why any equation calculated for any living being will always contain a small amount of error. For example, no two oranges ever produced in the world can be exactly identical in every way!

We can represent such approximate equations as follows:

Y = f(Xi) + ei

i=1 to n

Here, e is the error term—the gap between the reality and the portion of outcome getting explained by our statistical model.
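To make the error term concrete, here is a minimal sketch. The rule `weight = height - 100` and the three observations are invented purely for illustration and are not taken from any real study. The sketch computes e as the gap between each observed outcome and the part explained by f.

```python
# Hypothetical rule, assumed only for illustration: weight (kg) = height (cm) - 100.
def f(height_cm: float) -> float:
    """The part of the outcome that the model explains."""
    return height_cm - 100.0

# Made-up observations: (height in cm, actual weight in kg).
observations = [(160.0, 58.0), (170.0, 73.5), (180.0, 79.0)]

# e = Y - f(X): the gap between reality and the model's explanation.
errors = [actual - f(height) for height, actual in observations]

for (height, actual), error in zip(observations, errors):
    print(f"height={height} predicted={f(height)} actual={actual} error={error:+.1f}")
```

A positive error means the person weighs more than the rule predicts, and a negative error means less. No choice of f will drive every error to zero for living beings, which is the point made above.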

Statistical knowledge helps us estimate these approximate, unknown equations through data. Statistical/econometric models play a huge role in linking an outcome with its various factors through data, thereby deriving the approximate mathematical relationship between the two. These models are derived and validated through existing data.

Furthermore, in the higher levels of this construct (econometric models), detailed requirements are drawn for each of the three components of the model: the outcome (Y), its factors (Xi​), and the error (e) term. Advanced statistical models (econometric models) are derived and evaluated strictly within the boundaries of these first order and second order definitions and conditions.

These equations serve as the pearls of wisdom derived from the accumulated data relationships. If past events can be statistically validated, then these equations or formulas can be used to make predictions about another point with similar data that has not yet been observed.

Are Approximate Equations Good Enough in Astrology?

The most difficult thing in the world is to accurately understand all people at all times and to know what will happen to them. It would be a great achievement if we could find an approximate answer that is trustworthy enough.

Today, most of the products we use are marketed based on approximate assumptions about the users. Did you know that these approximate but evaluated assumptions drive many business decisions about products and services such as life-saving medications, vehicles, footwear, insurance plans, credit cards, YouTube video recommendations, and Facebook advertisements?

On this basis, we should call astrology a major human discovery and a precursor to statistical science. There is a strong basis to propose astrology as a statistical data science: it attempts to establish a relationship between changing astronomical data and life events by determining a birth chart based on planetary changes in the sky and then predicting life events based on continuous changes in those planetary positions. In this view, astrological rules are the final product, the accumulated knowledge or wisdom of a specific astrological construct’s statistical model.

We all know that astrology is a complex mathematical discipline. However, if we look at it through the lens of data science, its constructs are clearly based on the very same components of modern data science, statistical principles, and equations.

Now, I will introduce you to some very basic statistical terms and then give a brief note on how statistical models are derived.

Some Basic Statistical Terms

Data → Information → Knowledge → Wisdom

Data can fundamentally be numerals, letters, or a combination of both (alphanumeric). When building the final statistical models, letter-based variables are converted into numerical variables.

Data is the basis of decisions. It is vast in volume. Isolated, unconnected data carries no meaning. Example: Height 175 cm. In isolation, this has no meaning.

Data evolves into information when it is connected. At this stage, at least two data points are linked. Example: A specific person’s height is 175 cm. Here, a specific person is connected to their height.

Information can be further enriched and well-connected to become knowledge in a specific field. The strategies used to reach and control decision outcomes, gained through the experience of that knowledge, can then evolve into wisdom.

Variable

A symbol that identifies a specific trait, object, or state whose data can change from person to person. Variables can be represented by different units. For example, a person’s height, weight, blood type, gender, language, and economic status are all individual variables that change from person to person.

Types of Variables

Variables can be classified into four types based on the kind of data they can hold. Their names and properties are given below.

Continuous variable: This type holds whole numbers and decimals. For example, when a person’s height is specified in centimeters, it can take on any continuous value. For instance, a person’s height can be stated as 170 centimeters and 2.5 millimeters (170.25 cm).

These variables can also be represented by negative numbers. For example, the depth of the sea from sea level, -23.20 meters, is a negative and continuous variable. Since the practice of using negative numbers was not common when astrology was discovered, I will only use positive data in my examples from now on.

We can apply the four mathematical operations (addition, subtraction, multiplication, and division) to this type of variable. We can even represent very precise values with it. This type of variable can be easily converted into the other three types we’ll discuss next. It has been used sparingly in traditional astrology, but its use is seen in many places in ancient Indian astrology.

Astrological example: Data related to degrees (e.g., planetary degrees).

Integer variable: This type of variable can be positive or negative but holds only whole numbers. It usually represents a specific count. We can also apply the four mathematical operations to this type of variable. However, we should consider this type to be slightly less precise than the continuous variable we discussed above. For example, a value of 1.5 has to be rounded to 1 or 2 before it can be stored as an integer.

This type of construct is widely used in astrology and can be considered a major foundation of the field.

Astrological example: House number, nakshatra (star) number, etc.
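The conversion from a continuous variable to an integer variable can be sketched as follows. This assumes the zodiac is divided into 12 equal signs of 30 degrees and 27 equal nakshatras of 13 degrees 20 minutes, both counted from 0 degrees. The function names are hypothetical, and the sketch does not say which zodiac frame (sidereal or tropical) the longitude comes from. House numbers also depend on the ascendant, so they are not computed here.

```python
import math

SIGN_WIDTH_DEG = 360.0 / 12       # 30 degrees per sign
NAKSHATRA_WIDTH_DEG = 360.0 / 27  # 13 degrees 20 minutes per nakshatra

def sign_number(longitude_deg: float) -> int:
    """Map a continuous longitude (degrees) to an integer sign number, 1 to 12."""
    return int(math.floor((longitude_deg % 360.0) / SIGN_WIDTH_DEG)) + 1

def nakshatra_number(longitude_deg: float) -> int:
    """Map a continuous longitude (degrees) to an integer nakshatra number, 1 to 27."""
    return int(math.floor((longitude_deg % 360.0) / NAKSHATRA_WIDTH_DEG)) + 1

longitude = 45.5  # a made-up planetary longitude in degrees
print(sign_number(longitude), nakshatra_number(longitude))
```

Note the loss of precision: every longitude from 30 up to (but not including) 60 degrees maps to the same sign number, which is why the integer form is slightly less precise than the continuous one.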

Categorical variable: These can be represented as numbers or letters. In this type of variable, only the order of arrangement matters (statistics textbooks often call this an ordinal variable). The numerical data they contain cannot be directly used in mathematical equations. For example, let’s consider a variable for the economic status of the general public. We can represent this with the labels “poor,” “middle class,” and “rich.” We can also represent this label-based data as ordered numbers.

For example, 1 could represent the poor, 2 the middle class, and 3 the rich. Or it could be the reverse. The number-based data is only used to represent the order. These numbers cannot be used directly in all types of mathematical operations. For example, poor (1) + middle class (2) does not equal rich (3)!

Astrological constructs are full of such categorical variables.

Astrological example: Relationships between planets: deep friendship, friendship, neutral, enemy, extreme enemy, etc.
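The economic-status example can be sketched in code. The codes 1, 2, 3 come from the text, while the alternative codes 10, 20, 70 are an arbitrary assumption added to show that only the order matters.

```python
# Ordered codes for the economic-status example from the text.
STATUS_CODE = {"poor": 1, "middle class": 2, "rich": 3}

# Any other increasing codes express the same order (values chosen arbitrarily).
ALT_CODE = {"poor": 10, "middle class": 20, "rich": 70}

def is_higher(status_a: str, status_b: str) -> bool:
    """Order comparison: the one operation these codes safely support."""
    return STATUS_CODE[status_a] > STATUS_CODE[status_b]

people = ["rich", "poor", "middle class", "poor"]
ranked = sorted(people, key=lambda s: STATUS_CODE[s])
ranked_alt = sorted(people, key=lambda s: ALT_CODE[s])

print(ranked)
print(ranked == ranked_alt)               # same ranking under both codings
print(is_higher("rich", "middle class"))
```

Because the ranking is identical under both codings, the gaps between the codes carry no information. That is why adding the codes (1 + 2 = 3) produces a number with no real meaning.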

Nominal variable: Finally, these nominal variables can be taken as a symbol that represents individual characteristics. For example, a person’s name, blood type, gender, etc., belong to this type of variable. There is no ordering or ranking, such as “higher” or “lower.” Various reasons or explanations can be given for their creation. These variables can only be considered in terms of their count. Otherwise, they cannot be used to perform major statistical calculations directly.

This type of variable has been used extensively in astrology.

Astrological example: Houses assigned to planets, planetary significations (karakas), exalted/debilitated signs, etc.

Binary variable (or Dummy variable): This is a special type. We can further analyze both categorical and nominal variables and convert each sub-value into a binary variable (0, 1). This technique is used to convert letter-based data into numbers.

Astrological example: Astrological constructs like male signs and female signs.
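The conversion into binary (dummy) variables can be sketched with the blood type example mentioned earlier. The helper name `to_dummies` and the `is_` column prefix are hypothetical choices for this illustration.

```python
def to_dummies(values):
    """Turn a list of labels into one 0/1 column per distinct label."""
    categories = sorted(set(values))
    return [{f"is_{c}": int(v == c) for c in categories} for v in values]

blood_types = ["A", "O", "B", "O"]  # made-up sample
rows = to_dummies(blood_types)

for label, row in zip(blood_types, rows):
    print(label, row)
```

Each row has exactly one column set to 1. In practice, one of the columns is often dropped when the dummies are fed into a regression model, because it can be inferred from the others.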

Relationships Between Variables

We saw that data needs to be connected to be useful as information and knowledge. To connect two variables that hold data, a change in one of them must cause a significant change in the other or be correlated with it. This relationship can be positive or negative.

That is, when one variable increases in value, the other variable we are correlating it with should either increase or decrease. The rate of this change confirms the strength of the relationship between the two variables. The distribution of data between two variables can take the following forms.

Positive Linear Relationship Distribution: As height increases, weight also increases. Equivalently, as weight increases, height also increases.

Negative Linear Relationship Distribution: As the price increases, the quantity sold decreases. This is a negative relationship.

Non-linear Relationship Distribution: Variable 1 and Variable 2 are related, but the data points follow a curved path rather than a straight line.

No Relationship Distribution: There appears to be no relationship between Variable 1 and Variable 2. Therefore, based on a change in one, it is impossible to guess the trend or value of the other.
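One common way to put a number on the direction and strength of a linear relationship is the Pearson correlation coefficient, which ranges from -1 to +1. The text does not name this measure, so treat it as one possible choice. All data below is invented for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: +1 perfect positive, -1 perfect negative, 0 no linear link."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Made-up samples for three of the patterns described above.
r_positive = pearson_r([150, 160, 170, 180, 190], [50, 58, 69, 77, 90])  # height vs weight
r_negative = pearson_r([10, 20, 30, 40, 50], [95, 80, 62, 41, 20])       # price vs quantity
r_curved = pearson_r([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])                 # y = x squared

print(round(r_positive, 3), round(r_negative, 3), round(r_curved, 3))
```

The curved sample is a useful caution: the two variables are perfectly related (y is x squared), yet the coefficient is zero because it only detects straight-line relationships.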

When we visualize the changing data points between two variables in a two-dimensional space, the two-dimensional relationship between them must form a linear distribution or some other kind of regular distribution.

If this is the case, we can use Regression Analysis to find the relationship between the two as a statistical equation. There are many different methods for finding this relationship, depending on the types of variables, using statistical techniques.
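As a minimal sketch of regression analysis, the simplest case is a straight line fitted by ordinary least squares to one factor. The height and weight values are invented, and ordinary least squares is only one of the many methods the paragraph alludes to.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up sample: heights in cm and weights in kg.
heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0]
weights_kg = [50.0, 58.0, 69.0, 77.0, 90.0]

slope, intercept = fit_line(heights_cm, weights_kg)

# Use the fitted equation on a height that is not in the sample.
predicted_175 = slope * 175.0 + intercept
print(f"weight = {slope:.2f} * height + ({intercept:.2f})")
print(f"predicted weight at 175 cm: {predicted_175:.2f} kg")
```

The fitted slope and intercept are the "statistical equation" for this sample. With different data the numbers would change, which is why such equations always need to be validated against data.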


The Answer to My Question

Based on the statistical principles we’ve seen so far, let’s now look at how to find the answer to the question we started with.

I mentioned that there are many ways to answer my question. Here are a few. You can compare them to what you had in mind. If you thought of anything else, please feel free to share it in the comments.

Approach 1:

You can make an educated guess about each person’s weight based on your own height and waist circumference. The main problem with this method is that if the person in front of you is taller or bigger than you, your prediction will likely have a high degree of error. Your predictions for people of the opposite gender will also likely be wrong!

Approach 2:

You can ask the interviewer for the average weight of men and of women, and use the matching average as your answer for each person of that gender. You will make errors with this approach, too. However, because the average sits in the middle of the group, your total error across many people will usually be lower than with Approach 1. Therefore, this is a slightly better method.
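Why the average makes a reasonable single answer can be sketched numerically. Among all constant guesses, the group mean gives the smallest total squared error. The weights below are invented, and using the guesser's own weight as a constant guess is a deliberate simplification of Approach 1.

```python
def total_squared_error(guess, actual_weights):
    """Sum of squared gaps between one constant guess and each actual weight."""
    return sum((w - guess) ** 2 for w in actual_weights)

weights_kg = [62.0, 70.0, 75.0, 81.0, 92.0]  # made-up actual weights

mean_guess = sum(weights_kg) / len(weights_kg)  # Approach 2: the group average
own_weight_guess = 65.0                         # simplified Approach 1 (assumed value)

sse_mean = total_squared_error(mean_guess, weights_kg)
sse_own = total_squared_error(own_weight_guess, weights_kg)

print(mean_guess, sse_mean, sse_own)
```

Here the average roughly halves the total squared error compared with the personal guess. Individual answers can still be far off, which is what Approaches 3 and 4 try to improve.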

Approach 3:

You can ask the interviewer if there are any equations or rules that describe the relationship between a person’s weight and the measurements of their body parts that can be taken with the measuring tape. You can also ask for statistical data about the reliability of those equations.

Based on the information you receive, you can choose the best equation from the many provided (formulas) and use it to calculate the weight based on the measurements of specific parts of the person in front of you. When you do this, the error between your answer and the actual weight will be less than the first two methods. Since this equation or rule or formula is completely based on data, the chances of individual errors and inconsistent answers are lower with this method.

Approach 4:

If you have extensive training in statistics and have access to model-building software and expertise in the field, you can ask the interviewer for the data used to derive the rules, in addition to the rules themselves. You can then deeply analyze that data to confirm if the existing equations are sufficient and accurate. If necessary, you may even be able to improve those estimators. This approach takes a bit more time, but the total error will be even lower than the previous three methods.
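Approach 4 can be sketched in two steps: check an existing rule against the data, then refit the rule on that data. The rule `weight = height - 100`, the sample values, and the helper names are all hypothetical. A real analysis would also test the refitted rule on data that was not used to fit it.

```python
import math

heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0]  # made-up sample
weights_kg = [50.0, 58.0, 69.0, 77.0, 90.0]

def given_rule(height_cm):
    """Existing rule handed over by the interviewer (hypothetical)."""
    return height_cm - 100.0

def rmse(predict, xs, ys):
    """Root mean squared error of a prediction function on the data."""
    return math.sqrt(sum((y - predict(x)) ** 2 for x, y in zip(xs, ys)) / len(xs))

def refit_line(xs, ys):
    """Least-squares straight line fitted to the data, returned as a function."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept

refit_rule = refit_line(heights_cm, weights_kg)
rmse_given = rmse(given_rule, heights_cm, weights_kg)
rmse_refit = rmse(refit_rule, heights_cm, weights_kg)

print(f"existing rule RMSE: {rmse_given:.3f} kg")
print(f"refitted rule RMSE: {rmse_refit:.3f} kg")
```

On the data it was fitted to, the refitted line can never do worse than any other straight-line rule, so this comparison flatters the refit. That is why holding out separate data for validation matters.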

Of the four methods we’ve seen, Approach 4 is the best, most scientifically acceptable, and most verifiable. Depending on the time given to answer and the required accuracy of the results, you can choose Approach 2, 3, or 4.


A Statistical Comparison of Astrological Rules

Now, compare the statistical explanations we’ve seen so far with astrological rules and prediction techniques. You may be able to see the great connection between the two.

Today, all we have in astrology are the formulas and rules passed down to us by our ancestors. They are countless and are often used in isolation. We don’t have any validated data in astrology about which factors are the most important and which are secondary when multiple factors can cause a specific outcome. But it is my unwavering belief that these can be created over time with continuous effort.

I want to put a very important warning in front of you at this time. In statistical models, data about their reliability can also be obtained. The contribution of each factor to a specific outcome can also be estimated.
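One widely used piece of reliability data is R squared, the share of the variation in the outcome that a rule explains. The sketch below compares a hypothetical rule with a plain average on invented data. It does not measure the contribution of individual factors, which needs a model with several factors.

```python
def r_squared(actual, predicted):
    """1 minus (unexplained variation / total variation)."""
    mean_actual = sum(actual) / len(actual)
    unexplained = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    total = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - unexplained / total

heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0]  # made-up sample
weights_kg = [50.0, 58.0, 69.0, 77.0, 90.0]

# Hypothetical rule: weight = height - 100.
rule_predictions = [h - 100.0 for h in heights_cm]

# Baseline: always answer with the average weight.
mean_weight = sum(weights_kg) / len(weights_kg)
mean_predictions = [mean_weight] * len(weights_kg)

r2_rule = r_squared(weights_kg, rule_predictions)
r2_mean = r_squared(weights_kg, mean_predictions)

print(round(r2_rule, 4), round(r2_mean, 4))
```

A value near 1 means the rule explains almost all the variation in this sample, while the plain average scores 0 by construction. Comparable figures for astrological rules are what the author says is missing today.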

But in astrology, we don’t have anything like that today! For generations, we have been blindly relying on formulas that someone, somewhere, said. We are superficially carrying on astrology without the clarity that we can create superior formulas using the very same constructs that our ancestors used.

I will explain this statement in detail in the next part of this article, using the height and weight example we discussed earlier. The next part of this article is about building statistical models. In the next part, we will see how to build a statistical model based on a sample data set, what its products are, and what their use is. I conclude this second part here. If you have read this far, that in itself is a big deal. Thank you! 🙏


The content above is a Tamil-to-English translation produced with Google Gemini, with minor updates. A Tamil version of the essay is available if you prefer to read it in Tamil.


We have published many long-form videos in Tamil that explore astrology from a statistical perspective on our AIMLAstrology YouTube channel. You may benefit from visiting and learning from them as well.


Feel welcome to share your comments or feedback!
