17.2: Projective model

The following picture illustrates the map \(P\mapsto \hat{P}\) described in the previous section: if you take the picture on the left and apply the map \(P\mapsto \hat{P}\), you get the picture on the right. The pictures show the conformal and projective models of the hyperbolic plane respectively. The map \(P\mapsto \hat{P}\) is a “translation” from one to the other.

In the projective model things look different; some become simpler, others become more complicated.


The h-lines in the projective model are chords of the absolute; more precisely, chords without their endpoints.

This observation can be used to transfer statements about lines and points from the Euclidean plane to the h-plane. As an example, let us state a hyperbolic version of Pappus’ theorem for the h-plane.

Theorem \(\PageIndex{1}\) Hyperbolic Pappus' theorem

Assume that two triples of h-points \(A\), \(B\), \(C\), and \(A'\), \(B'\), \(C'\) in the h-plane are h-collinear. Suppose that the h-points \(X\), \(Y\), and \(Z\) are defined by

\(\begin{aligned} X&=(BC')_h\cap(B'C)_h, & Y&=(CA')_h\cap(C'A)_h, & Z&=(AB')_h \cap(A'B)_h.\end{aligned}\)

Then the points \(X\), \(Y\), \(Z\) are h-collinear.

In the projective model, this statement follows immediately from the original Pappus’ theorem 15.6.2. The same can be done for Desargues’ theorem 15.6.1. The same argument shows that the ruler-only construction of a tangent line described in Exercise 15.8.2 works in the h-plane as well.

On the other hand, note that it is not at all easy to prove this statement using the conformal model.

Circles and equidistants

The h-circles and equidistants in the projective model are certain types of ellipses and their open arcs.

This follows since the stereographic projection sends circles in the plane to circles on the unit sphere, and the foot-point projection of a circle on the sphere back to the plane is an ellipse. (One may define an ellipse as the foot-point projection of a circle.)


Consider a pair of h-points \(P\) and \(Q\). Let \(A\) and \(B\) be the ideal points of the h-line through them in the projective model; that is, \(A\) and \(B\) are the intersections of the Euclidean line \((PQ)\) with the absolute.

Then by Lemma 17.1.1,

\[PQ_h=\dfrac{1}{2} \cdot \ln \dfrac{AQ\cdot BP}{QB\cdot PA},\]

assuming the points \(A\), \(P\), \(Q\), \(B\) appear on the line in this order.
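The formula above can be evaluated numerically: given two points inside the unit disc, intersect the Euclidean line through them with the absolute and take the logarithm of the cross-ratio. A minimal sketch (the helper name `klein_distance` is ours, not from the text):

```python
import math

def klein_distance(p, q):
    """Hyperbolic distance between points p and q inside the unit disc
    (projective/Klein model), via PQ_h = 1/2 * ln((AQ*BP)/(QB*PA)),
    where A, P, Q, B lie on the chord in this order and A, B are the
    ideal points on the absolute."""
    px, py = p
    qx, qy = q
    dx, dy = qx - px, qy - py
    # Parametrize the line as P + s*(Q - P); solve |P + s*(Q - P)| = 1.
    a = dx * dx + dy * dy
    b = 2 * (px * dx + py * dy)
    c = px * px + py * py - 1
    disc = math.sqrt(b * b - 4 * a * c)
    s1 = (-b - disc) / (2 * a)   # ideal point A, on P's side
    s2 = (-b + disc) / (2 * a)   # ideal point B, on Q's side
    A = (px + s1 * dx, py + s1 * dy)
    B = (px + s2 * dx, py + s2 * dy)
    dist = lambda u, v: math.hypot(u[0] - v[0], u[1] - v[1])
    return 0.5 * math.log((dist(A, q) * dist(B, p)) / (dist(q, B) * dist(p, A)))
```

For example, with \(P\) at the center and \(Q=(t,0)\) the ideal points are \((\mp 1, 0)\), and the function returns \(\tfrac{1}{2}\ln\tfrac{1+t}{1-t}\), matching the computation used later in this section.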


The angle measures in the projective model are very different from the Euclidean angles, and they are hard to figure out by looking at the picture. For example, all the intersecting h-lines in the picture are perpendicular. (The idea described in the solution of Exercise 16.3.1 and in the sketch of proof of Theorem 19.4.1 can be used to construct many projective transformations of this type.)

There are two useful exceptions:

  • If \(O\) is the center of the absolute, then \[\measuredangle_h AOB=\measuredangle AOB.\]

  • If \(O\) is the center of the absolute and \(\measuredangle OAB=\pm\tfrac{\pi}{2}\), then

\(\measuredangle_h OAB=\measuredangle OAB=\pm \dfrac{\pi}{2}.\)

To find the angle measure in the projective model, you may apply a motion of the h-plane that moves the vertex of the angle to the center of the absolute; once this is done, the hyperbolic and Euclidean angles have the same measure.


The motions of the h-plane in the conformal and projective models are related to inversive and projective transformations in the same way. Namely:

  • Inversive transformations that preserve the h-plane describe motions of the h-plane in the conformal model.

  • Projective transformations that preserve the h-plane describe motions of the h-plane in the projective model.

The following exercise is a hyperbolic analog of Exercise 16.5.1. This is the first example of a statement that admits an easier proof using the projective model.

Exercise \(\PageIndex{1}\)

Let \(P\) and \(Q\) be points in the h-plane that lie at the same distance from the center of the absolute. Observe that in the projective model, the h-midpoint of \([PQ]_h\) coincides with the Euclidean midpoint of \([PQ]_h\).

Conclude that if an h-triangle is inscribed in an h-circle, then its medians meet at one point.

Recall that an h-triangle might be also inscribed in a horocycle or an equidistant. Think how to prove the statement in this case.


The observation follows since the reflection across the perpendicular bisector of \([PQ]\) is a motion of the Euclidean plane, and a motion of the h-plane as well. Without loss of generality, we may assume that the center of the circumcircle coincides with the center of the absolute. In this case, the h-medians of the triangle coincide with the Euclidean medians. It remains to apply Theorem 8.3.1.

Exercise \(\PageIndex{2}\)

Let \(\ell\) and \(m\) be h-lines in the projective model. Let \(s\) and \(t\) denote the Euclidean lines tangent to the absolute at the ideal points of \(\ell\). Show that if the lines \(s\), \(t\), and the extension of \(m\) intersect at one point, then \(\ell\) and \(m\) are perpendicular h-lines.


Let \(\hat{\ell}\) and \(\hat{m}\) denote the h-lines in the conformal model that correspond to \(\ell\) and \(m\). We need to show that \(\hat{\ell} \perp \hat{m}\) as arcs in the Euclidean plane.

The point \(Z\), where \(s\) meets \(t\), is the center of the circle \(\Gamma\) containing \(\hat{\ell}\).

If \(\hat{m}\) passes through \(Z\), then the inversion in \(\Gamma\) exchanges the ideal points of \(\hat{m}\). In particular, \(\hat{m}\) maps to itself. Hence the result.

Exercise \(\PageIndex{3}\)

Use the projective model to derive the formula for the angle of parallelism (Proposition 13.1.1).


Let \(Q\) be the foot point of \(P\) on the line and \(\varphi\) be the angle of parallelism. We can assume that \(P\) is the center of the absolute. Therefore \(PQ = \cos \varphi\) and

\[PQ_h = \dfrac{1}{2} \cdot \ln \dfrac{1 + \cos \varphi}{1 - \cos \varphi}.\]
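The derived expression is \(\operatorname{artanh}(\cos\varphi)\), so it is equivalent to the classical relation \(\cos\varphi = \tanh(PQ_h)\). A short numerical check of this equivalence (the helper name is ours):

```python
import math

# Distance from P to the line, given the angle of parallelism phi,
# as derived above: PQ_h = 1/2 * ln((1 + cos phi)/(1 - cos phi)).
def dist_from_parallelism(phi):
    return 0.5 * math.log((1 + math.cos(phi)) / (1 - math.cos(phi)))

# Equivalent classical form: cos(phi) = tanh(PQ_h).
for phi in (0.3, 0.7, 1.2):
    d = dist_from_parallelism(phi)
    assert abs(math.cos(phi) - math.tanh(d)) < 1e-12
```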

Exercise \(\PageIndex{4}\)

Use the projective model to find the inradius of the ideal triangle.


Apply Exercise \(\PageIndex{3}\) for \(\varphi = \dfrac{\pi}{3}\).

The projective model of the h-plane can be used to give another proof of the hyperbolic Pythagorean theorem (Theorem 13.6.1).

First let us recall its statement:

\[\cosh c=\cosh a\cdot\cosh b,\]

where \(a=BC_h\), \(b=CA_h\), and \(c=AB_h\), and \(\triangle_h ACB\) is an h-triangle with a right angle at \(C\).

Note that we can assume that \(A\) is the center of the absolute. Set \(s=BC\), \(t =CA\), \(u= AB\). According to the Euclidean Pythagorean theorem (Theorem 6.2.1), we have

\[s^2+t^2=u^2.\]

It remains to express \(a\), \(b\), and \(c\) using \(s\), \(t\), and \(u\) and show that 17.2.3 implies 17.2.2.

Advanced Exercise \(\PageIndex{5}\)

Finish the proof of the hyperbolic Pythagorean theorem (Theorem 13.6.1) indicated above.


Note that \(b = \dfrac{1}{2} \cdot \ln \dfrac{1 + t}{1 - t}\); therefore,

\[\cosh b = \dfrac{1}{2} \cdot \left(\sqrt{\dfrac{1 + t}{1 - t}} + \sqrt{\dfrac{1 - t}{1 + t}}\right) = \dfrac{1}{\sqrt{1 - t^2}}.\]

The same way we get that

\[\cosh c = \dfrac{1}{\sqrt{1 - u^2}}.\]

Let \(X\) and \(Y\) be the ideal points of \((BC)_h\). Applying the Euclidean Pythagorean theorem (Theorem 6.2.1) again, we get that \(CX = CY = \sqrt{1 - t^2}\). Therefore,

\[a = \dfrac{1}{2} \cdot \ln \dfrac{\sqrt{1 - t^2} + s}{\sqrt{1 - t^2} - s},\]


\[\cosh a = \dfrac{1}{2} \cdot \left(\sqrt{\dfrac{\sqrt{1 - t^2} + s}{\sqrt{1 - t^2} - s}} + \sqrt{\dfrac{\sqrt{1 - t^2} - s}{\sqrt{1 - t^2} + s}}\right) = \dfrac{\sqrt{1 - t^2}}{\sqrt{1 - t^2 - s^2}} = \dfrac{\sqrt{1 - t^2}}{\sqrt{1 - u^2}}.\]

Finally, note that 17.2.5, 17.2.6, and 17.2.7 imply the theorem.
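The three expressions for \(\cosh a\), \(\cosh b\), and \(\cosh c\) can also be checked to satisfy the theorem numerically. A small sketch with arbitrary sample values (chosen by us so that the points lie inside the absolute):

```python
import math

# Numerical check of cosh c = cosh a * cosh b, using the expressions
# derived above: with A at the center, t = CA, s = CB, u = AB, and
# s^2 + t^2 = u^2 by the Euclidean Pythagorean theorem,
#   cosh b = 1/sqrt(1 - t^2),
#   cosh c = 1/sqrt(1 - u^2),
#   cosh a = sqrt(1 - t^2)/sqrt(1 - t^2 - s^2).
t, s = 0.5, 0.3
u = math.hypot(s, t)
cosh_b = 1 / math.sqrt(1 - t**2)
cosh_c = 1 / math.sqrt(1 - u**2)
cosh_a = math.sqrt(1 - t**2) / math.sqrt(1 - t**2 - s**2)
assert abs(cosh_c - cosh_a * cosh_b) < 1e-12
```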

Introduction to Data Science

This screening test is 99% accurate in detecting some condition.

This algorithm detects fraudulent credit card transactions with an accuracy of 99%.

Both statements sound pretty good, but we should not stop there; instead, we should ask further questions.

What is the measure on which the test quality is being evaluated and reported? What about other measures?

What is the baseline performance and the performance of alternative algorithms?

Three key questions to ask before trying to predict anything:

How is performance or predictive success evaluated?

What is the baseline performance and the performance of alternative benchmarks?

Distinguish between different prediction tasks and the measures used for evaluating success:

1. Two principal predictive tasks: classification vs. point predictions

2. Different measures for evaluating (quantifying) the quality of predictions

Note: It is tempting to view classification tasks and quantitative tasks as “qualitative” vs. “quantitative” types of prediction. However, this would be misleading, as qualitative predictions are also evaluated in a quantitative fashion. Thus, we prefer to distinguish between different tasks, rather than different types of prediction.

17.2.1 Types of tasks

ad 1A: Two types of predictive tasks

qualitative prediction tasks: Classification tasks. Main goal: Predict the membership in some category.
Secondary goal: Evaluation by a 2x2 matrix of predicted vs. true cases (with 2 correct cases and 2 errors).

quantitative prediction tasks: Point predictions with numeric outcomes. Main goal: Predict some value on a scale.
Secondary goal: Evaluation by the distance between predicted and true values.
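The two evaluation styles can be sketched in a few lines (the data below are illustrative, not from the text):

```python
# Classification task: compare predicted vs. true category labels.
true_cls = ["spam", "ham", "spam", "ham", "spam"]
pred_cls = ["spam", "ham", "ham", "ham", "spam"]
# Fraction of exact label matches (one cell of the 2x2 evaluation).
accuracy = sum(p == t for p, t in zip(pred_cls, true_cls)) / len(true_cls)

# Point prediction task: measure the distance between predicted
# and true values, here as mean absolute error (MAE).
true_val = [2.0, 3.5, 5.0]
pred_val = [2.5, 3.0, 4.0]
mae = sum(abs(p - t) for p, t in zip(pred_val, true_val)) / len(true_val)

print(accuracy)  # 0.8
```

Note how the classification score counts discrete hits and misses, while the point-prediction score aggregates continuous distances.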

Note: Some authors (e.g., in Chapter 7 Fitting models with parsnip of Tidy Modeling with R) distinguish between different modes. The mode reflects the type of prediction outcome. For numeric outcomes, the mode is regression; for qualitative outcomes, it is classification.

17.2.2 Evaluating predictive success

ad 1B: Given some prediction, how is predictive success evaluated (quantified)?

Remember the earlier example of the mammography problem: Screening has high sensitivity and specificity, but low PPV.

Note that — for all types of prediction — there are always trade-offs between many alternative measures for quantifying their success. Maximizing only one of them can be dangerous and misleading.
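To see how high sensitivity and specificity can coexist with a low PPV, consider a small worked example (the population size, prevalence, and test characteristics below are assumed for illustration, not taken from the text):

```python
# Assumed: population of 10,000 with 1% prevalence; a test with
# 90% sensitivity and 91% specificity.
population = 10_000
sick = 100            # 1% prevalence
tp = 90               # 90% sensitivity: 90 of 100 sick test positive
fn = sick - tp        # 10 false negatives
healthy = population - sick
tn = 9009             # 91% specificity: 9009 of 9900 healthy test negative
fp = healthy - tn     # 891 false positives

ppv = tp / (tp + fp)                 # ~0.09: low despite high sens/spec
accuracy = (tp + tn) / population    # ~0.91
```

Out of 981 positive test results, only 90 are true positives, so a positive result indicates the condition in under 10% of cases.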

17.2.3 Baseline performance and other benchmarks

ad 2. At first glance, the instruction “Predict the phenomenon of interest with high accuracy.” seems a reasonable answer to the question “What characterizes a successful predictive algorithm?”
However, high accuracy is not very impressive if the baseline is already quite high. For instance, if it rains on only 10% of all summer days in some region, always predicting “no rain” will achieve an accuracy of 90%.

This seems trivial, but consider the above examples of detecting some medical condition or fraudulent credit card transactions: For a rare medical condition, a fake pseudo-test that always says “healthy” would achieve an impressive accuracy. Similarly, if over 99% of all credit card transactions are legitimate, always predicting “transaction is ok” would achieve an accuracy of over 99%…
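The rain example can be made concrete in a couple of lines (illustrative data):

```python
# Assumed: 100 summer days, 10% of which are rainy.
days = ["rain"] * 10 + ["no rain"] * 90

# A naive baseline that always predicts the majority class.
baseline_preds = ["no rain"] * len(days)
baseline_acc = sum(p == t for p, t in zip(baseline_preds, days)) / len(days)
print(baseline_acc)  # 0.9
```

Any proposed algorithm should be compared against this kind of trivial benchmark before its accuracy is taken as evidence of skill.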

Hence, we should always ask ourselves:

What is the lowest possible benchmark (e.g., for random predictions)?

What levels can be achieved by naive or very simple predictions?

As perfection is typically impossible, we need to decide how much better our rule needs to be than alternative algorithms. For this latter evaluation, it is important to know the competition.

Oracle® Data Miner

This document provides late-breaking information and information that is not yet part of the formal documentation.

This document contains the following topics:

New Features in Oracle Data Miner

Oracle Data Mining Features

The new Oracle Data Mining features include:

Association Model Aggregation Metrics

Oracle Data Miner 17.2 supports the enhanced Association Rules algorithm and allows the user to filter items before building the Association model.

The user can set the filters in the Association Build node editor, Association model viewer, and Model Details node editor.

Enhancements to Algorithm Settings

Oracle Data Miner 17.2 has been enhanced to support enhancements in Oracle Data Mining that include build settings for building partition models, sampling of training data, numeric data preparation that includes shift and scale transformations, and so on.

These settings are available if Oracle Data Miner 17.2 is connected to Oracle Database 12.2.

Changes to the algorithms include:

Changes to Decision Tree Algorithm Settings

The setting Maximum Supervised Bins CLAS_MAX_SUP_BINS is added in the Decision Tree algorithm.

Changes to Expectation Maximization Algorithm Settings

The setting Level of Details replaces the current setting Gather Cluster Statistics.

The underlying algorithm setting used is EMCS_CLUSTER_STATISTICS where All=ENABLE, and Hierarchy=DISABLE. Some additional settings are added and some settings are deprecated.


Changes to Generalized Linear Models Algorithm Settings

The following changes are included in the Generalized Linear Model algorithm settings. The changes apply to both Classification models and Regression models.

Convergence Tolerance GLMS_CONV_TOLERANCE

Number of Iterations GLMS_NUM_ITERATIONS




Categorical Predictor Treatment GLMS_SELECT_BLOCK

Sampling for Feature Identification GLMS_FTR_IDENTIFICATION

Feature Acceptance GLMS_FTR_ACCEPTANCE

Changes to k-Means Algorithm Settings

The following changes are incorporated into the k-Means algorithm settings.

Levels of Details KMNS_DETAILS


Changes to Support Vector Machine Algorithm Settings

The following changes are included in the Support Vector Machine algorithm settings. The changes are applicable to both Linear and Gaussian kernel functions.

Number of Iterations SVMS_NUM_ITERATIONS


Applies to Gaussian kernel function only.

Applies to Gaussian kernel function only.

Changes to Singular Value Decomposition and Principal Components Analysis Algorithm Settings

The following changes are included in the Singular Value Decomposition and Principal Components Analysis algorithm.





Support for Explicit Semantic Analysis Algorithm

Oracle Data Miner 17.2 supports a new feature extraction algorithm called Explicit Semantic Analysis algorithm.

The algorithm is supported by two new nodes: the Explicit Feature Extraction node and the Feature Compare node.

Explicit Feature Extraction Node

The Explicit Feature Extraction node is built using the Explicit Semantic Analysis algorithm.


Feature Compare Node

The Feature Compare node enables you to perform calculations related to semantics in text data contained in one Data Source node against another Data Source node. The node requires:

  • Two input data sources. A data source can be a flow of records, such as those connected by a Data Source node, or a single record entered by the user inside the node. For data entered by users, no input data provider is needed.

  • One input Feature Extraction or Explicit Feature Extraction model, where a model can be selected for calculations related to semantics.

Enhancement to Data Mining Model Detail View

The model viewers in Oracle Data Miner 17.2 have been enhanced to reflect the changes in Oracle Data Mining.

Enhancements to the model viewers include the following:

The computed settings within the model are displayed in the Settings tab of the model viewer.

The new user-embedded transformation dictionary view is integrated with the Inputs tab under Settings.

The build details data are displayed in the Summary tab under Summary.

The Cluster model viewer detects models with partial details and displays a message indicating so. This also applies to the k-Means model viewer and the Expectation Maximization model viewer.

Enhancements to Filter Column Node

Oracle Data Mining supports unsupervised Attribute Importance ranking. The Attribute Importance ranking of a column is generated without the need for selecting a target column. The Filter Column node has been enhanced to support unsupervised Attribute Importance ranking.

Mining Model Build Alerts

Oracle Data Miner logs alerts related to model builds in the model viewers and event logs.

Model viewers: The build alerts are displayed in the Alerts tab.

Event log: All build alerts are displayed along with other details such as job name, node, sub node, time, and message.

R Build Model Node

Oracle Data Mining provides the feature to add R model implementations within the Oracle Data Mining framework. To support R model integration, Oracle Data Miner has been enhanced with a new R Build node with mining functions such as Classification, Regression, Clustering, and Feature Extraction.

Support for Partitioned Models

Oracle Data Miner supports the building and testing of partitioned models.

Oracle Data Miner Features

The new Oracle Data Miner features include:

Aggregation Node Support for DATE and TIMESTAMP Data Types

The Aggregation node has been enhanced to support DATE and TIMESTAMP data types.

For DATE and TIMESTAMP data types, the functions available are COUNT(), COUNT (DISTINCT()), MAX(), MEDIAN(), MIN(), STATS_MODE().

Enhancement to JSON Query Node

The JSON Query node allows you to specify filter conditions on attributes with data types such as ARRAY, BOOLEAN, NUMBER, and STRING.

JSON Unnest: Applies filters to JSON data that is used for projection to relational data format.

Aggregations: Applies filters to JSON data that is used for aggregation.

JSON Unnest and Aggregations: Applies filters to both.

Enhancement to Build Nodes

All Build nodes are enhanced to support sampling of training data and preparation of numeric data.

The enhancement is implemented in the Sampling tab in all Build node editors. By default, the Sampling option is set to OFF. When set to ON, the user can specify the sample row size or choose the system-determined settings.

Data preparation is not supported in Association Build model.

Edit Anomaly Detection Node

Edit Association Build Node

Edit Classification Build Node

Edit Clustering Build Node

Edit Explicit Feature Extraction Build Node

Edit Feature Extraction Build Node

Edit Regression Build Node

Enhancement to Text Settings

Text settings are enhanced to support the following features:

Text support for synonyms (thesaurus): Text Mining in Oracle Data Miner supports synonyms. By default, no thesaurus is loaded. The user must manually load the default thesaurus provided by Oracle Text or upload their own thesaurus.

New settings added in Text tab:

Minimum number of rows (documents) required for a token

Max number of tokens across all rows (documents)

New tokens added for BIGRAM setting:

BIGRAM: Here, NORMAL tokens are mixed with their bigrams.

STEM BIGRAM: Here, STEM tokens are extracted first and then stem bigrams are formed.

Refresh Input Data Definition

Use the Refresh Input Data Definition option if you want to update the workflow with new columns that were either added or removed.

The Refresh Input Data Definition option is available as a context menu option in Data Source nodes and SQL Query nodes.

Support for Additional Data Types

Oracle Data Miner allows the following data types for input as columns in a Data Source node, and as new computed columns within the workflow:

Support for In-Memory Column

Oracle Data Miner supports In-Memory Column Store (IM Column Store) in Oracle Database and later, which is an optional static SGA pool that stores copies of tables and partitions in a special columnar format.

Oracle Data Miner has been enhanced to support In-Memory Column in nodes in a workflow. For In-Memory Column settings, the options to set Data Compression Method and Priority Level are available in the Edit Node Performance Settings dialog box.

Support for Workflow Scheduling

Oracle Data Miner 17.2 supports scheduling workflows to run at a specified date and time.

A scheduled workflow is available only for viewing. The option to cancel a scheduled workflow is available. After cancelling a scheduled workflow, the workflow can be edited and rescheduled.

Enhancement to Polling Performance

Polling performance and resource utilization functionality has been enhanced with new user interfaces.

When POLLING_IDLE_ENABLED is set to TRUE, automatic querying for workflow status takes effect. When POLLING_IDLE_ENABLED is set to FALSE, the status query must be run manually.

A new dockable window, Scheduled Workflow, has been added; it displays the list of scheduled jobs and allows the user to manage them.

Manual refresh of workflow jobs.

Administrative override of automatic updates through Oracle Data Miner repository settings.

Access to Workflow Jobs preferences through the new Settings option.

Workflow Status Polling Performance Improvement

The performance of workflow status polling has been enhanced.

The enhancement includes new repository views, repository properties, and user interface changes:

The repository view ODMR_USER_WORKFLOW_ALL_POLL is added for workflow status polling.

The following repository properties are added:

POLLING_IDLE_RATE: Determines the rate at which the client will poll the database when there are no workflows detected as running.

POLLING_ACTIVE_RATE: Determines the rate at which the client will poll the database when there are workflows detected running.

POLLING_IDLE_ENABLED: When set to TRUE, automatic querying for workflow status takes effect; when set to FALSE, the status query must be run manually.

POLLING_COMPLETED_WINDOW: Determines the time required to include completed workflows in the polling query result.

PURGE_WORKFLOW_SCHEDULER_JOBS: Purges old Oracle Scheduler objects generated by the running of Data Miner workflows.

PURGE_WORKFLOW_EVENT_LOG: Controls how many workflow runs are preserved for each workflow in the event log. The events of the older workflow are purged to keep within the limit.

The new user interface includes the Scheduled Jobs window, which can be accessed from the Data Miner option in both the Tools menu and the View menu in SQL Developer 17.2.

Oracle Database Features

The new Oracle Database feature includes Support for Expanded Object Name.

Support for Expanded Object Name

Support for schema names, table names, column names, and synonyms that are up to 128 bytes is available in the upcoming Oracle Database release. To support this, Oracle Data Miner repository views, tables, XML schema, and PL/SQL packages are enhanced to support 128-byte names.

Supported Platforms

Prerequisites for Oracle Data Miner 17.2

  1. Install SQL Developer 17.2 on your system.
  2. Secure access to an Oracle Database:
    • Minimum version: Oracle Database Enterprise Edition, with the Data Mining option.
    • Preferred version: Oracle Database 12.2 Enterprise Edition.
  3. Create a database user account for data mining.
  4. Create a database connection within SQL Developer for the Oracle Data Miner user.
  5. Install the Oracle Data Miner repository.

The SH sample schema is not shipped with Oracle Database 12.2. To install the sample schema, go to

Known Problems and Limitations

Known problems and limitations in this release include:

Association Model Build node cannot consume data coming directly from JSON Query node.

Users must persist the data coming from the JSON Query node through a Create Table node, and then use the persisted data as input to the Association Model Build node.

Classification nodes and Regression Model Build nodes are unable to consume data directly coming from JSON Query node if JSON Aggregations (with Sub Group By) are defined.

Users must persist the data coming from the JSON Query node through Create Table node, and then use the persisted data as input to these Build nodes.

Build nodes can consume data directly coming from JSON Query nodes if JSON Aggregations (without Sub Group By) are not defined.

Setting Parallel Query for a node that queries JSON data can result in a workflow runtime error. JSON queries will fail if they are run with the database Parallel Query set to ON. The following error message is displayed: ORA-12805: Parallel Query server died unexpectedly.

The Node context menu has the option to set Parallel Query. Click Parallel Query and select the nodes to configure the parallel settings.

The View Data viewer provides the option to set Parallel Query to ON when querying the selected Data Nodes.

In both cases, the error occurs and the same error message is displayed.

Request the Oracle Database patch through Oracle Support.

You can ignore these error messages during Oracle Data Miner 17.2 installation, if no exceptions are generated.

Bug Fixes

Oracle Data Miner 17.2 includes 122 bug fixes.

Documentation Accessibility

For information about Oracle's commitment to accessibility, visit the Oracle Accessibility Program website at

Oracle customers that have purchased support have access to electronic support through My Oracle Support. For information, visit or visit if you are hearing impaired.

Oracle® Data Miner Release Notes , Release 17.2

Copyright © 2016, 2017, Oracle and/or its affiliates. All rights reserved.

This software and related documentation are provided under a license agreement containing restrictions on use and disclosure and are protected by intellectual property laws. Except as expressly permitted in your license agreement or allowed by law, you may not use, copy, reproduce, translate, broadcast, modify, license, transmit, distribute, exhibit, perform, publish, or display any part, in any form, or by any means. Reverse engineering, disassembly, or decompilation of this software, unless required by law for interoperability, is prohibited.

The information contained herein is subject to change without notice and is not warranted to be error-free. If you find any errors, please report them to us in writing.

If this is software or related documentation that is delivered to the U.S. Government or anyone licensing it on behalf of the U.S. Government, then the following notice is applicable:

U.S. GOVERNMENT END USERS: Oracle programs, including any operating system, integrated software, any programs installed on the hardware, and/or documentation, delivered to U.S. Government end users are "commercial computer software" pursuant to the applicable Federal Acquisition Regulation and agency-specific supplemental regulations. As such, use, duplication, disclosure, modification, and adaptation of the programs, including any operating system, integrated software, any programs installed on the hardware, and/or documentation, shall be subject to license terms and license restrictions applicable to the programs. No other rights are granted to the U.S. Government.

This software or hardware is developed for general use in a variety of information management applications. It is not developed or intended for use in any inherently dangerous applications, including applications that may create a risk of personal injury. If you use this software or hardware in dangerous applications, then you shall be responsible to take all appropriate fail-safe, backup, redundancy, and other measures to ensure its safe use. Oracle Corporation and its affiliates disclaim any liability for any damages caused by use of this software or hardware in dangerous applications.

Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners.

Intel and Intel Xeon are trademarks or registered trademarks of Intel Corporation. All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC International, Inc. AMD, Opteron, the AMD logo, and the AMD Opteron logo are trademarks or registered trademarks of Advanced Micro Devices. UNIX is a registered trademark of The Open Group.

This software or hardware and documentation may provide access to or information about content, products, and services from third parties. Oracle Corporation and its affiliates are not responsible for and expressly disclaim all warranties of any kind with respect to third-party content, products, and services unless otherwise set forth in an applicable agreement between you and Oracle. Oracle Corporation and its affiliates will not be responsible for any loss, costs, or damages incurred due to your access to or use of third-party content, products, or services, except as set forth in an applicable agreement between you and Oracle.

Introduction to Data Science

The basic models of agents and environments discussed in the previous chapter (see Chapter 16 on Dynamic simulations) were typically developed for individuals. For instance, the framework of reinforcement learning (RL, or the more general class of Markov Decision Processes, MDPs) allows an agent to learn a strategy that maximizes some reward in a stable environment. The options in this environment may be stochastic (e.g., a multi-armed bandit, MAB). Provided that the agent has sufficient time to explore all options, the agent is guaranteed to find the best option. However, if fundamental aspects of the environment change (e.g., by adding options or changing their reward functions), a learning agent may no longer converge on the best option.

Social situations typically change an environment in several ways. Depending on the details of the agent-environment interaction (e.g., how rewards are distributed), they may introduce competition between agents, opportunities for cooperation, and new types of information and learning. For instance, the success of any particular agent strategy can crucially depend on the strategies of other agents. Thus, allowing for other agents calls into question many results that hold for individual situations and calls for additional types of models and modeling.

In this chapter, we introduce three basic paradigms of modeling social situations:

17.2.1 Games

The scientific discipline of game theory has little resemblance to children’s games, but instead studies mathematical models of strategic interactions among rational decision-makers (see Wikipedia). In a similar vein, the Stanford Encyclopedia of Philosophy defines game theory as

… the study of the ways in which interacting choices of economic agents
produce outcomes with respect to the preferences (or utilities) of those agents,
where the outcomes in question might have been intended by none of the agents.

Ross (2019)

The article immediately adds that the meaning of this definition requires an understanding of the italicized concepts (and provides these details in the corresponding article). For us, it is interesting that games are expressed in terms of economic exchanges, involving agents that pursue their goals, but may obtain outcomes that also depend on the goals and actions of other agents.

We can define a game as a payoff matrix that shows the options and corresponding rewards for its players (in terms of points gained or lost for each combination of outcomes).

One of the simplest possible games is illustrated by Table 17.1. In the matching pennies (MP) game, two players each select one side of a coin (i.e., heads H, or tails T). One player wins if both sides match, the other player wins whenever both sides differ. The payoff matrix shown in Table 17.1 illustrates the options of Player 1 as rows and the first payoff value in each cell, whereas the options of Player 2 are shown as columns and the second payoff values. Thus, Player 1 wins a point \((+1)\) and Player 2 loses a point \((-1)\) whenever both players selected the same side of their coins (“HH” or “TT”). Player 2 wins a point \((+1)\) and Player 1 loses a point \((-1)\) whenever both players selected different sides of their coins (“HT” or “TH”).

Table 17.1: The payoff matrix of the matching pennies game.
Options H T
H \((+1, -1)\) \((-1, +1)\)
T \((-1, +1)\) \((+1, -1)\)
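The payoff matrix of Table 17.1 can be represented directly as a data structure, which makes the zero-sum property easy to verify (a sketch in Python; the representation is ours):

```python
# Matching pennies payoffs: payoffs[row][col] = (payoff to Player 1,
# payoff to Player 2), with Player 1 choosing the row.
payoffs = {
    "H": {"H": (+1, -1), "T": (-1, +1)},
    "T": {"H": (-1, +1), "T": (+1, -1)},
}

# Zero-sum: in every cell, the two payoffs add up to 0.
assert all(p1 + p2 == 0
           for row in payoffs.values()
           for (p1, p2) in row.values())
```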

Winning a game can correspond to maximizing one’s individual reward in a game or to gaining more reward than other players of the same game. As the outcome of any single game may be highly dependent on chance factors, the success of a game-playing strategy is typically assessed over repeated instances of the same game. The goal of maximizing one’s own rewards can create dilemmas when contrasting different levels of aggregation (e.g., single vs. repeated games, or individual vs. population gains).

Games can be classified into types by comparing the players’ reward functions. The most important types of games are:

zero-sum or purely competitive games: One player’s wins correspond to another player’s losses, so that the total of all rewards adds up to zero (0).

common interest, identical, or symmetrical games: All players have the same reward function.

general sum games: An umbrella category for other games (i.e., not of the other types).

Note that these categories are not mutually exclusive. For instance, the matching pennies game (shown in Table 17.1) is a symmetrical, purely competitive, zero-sum game. If one player’s win did not correspond to the other player’s loss (e.g., if all payoffs of \(-1\) were replaced by payoffs of \(0\)), the game would no longer be a zero-sum game, but would still be a symmetrical, common interest game.

To provide a contrast to the competitive matching pennies game, Table 17.2 shows a coordination game. As both players still have identical payoffs, this is a symmetrical common interest game in which both players win by choosing the same side of their coins. Note that there are two different local maxima in this game: As long as both options match, no player is motivated to alter her or his choice. The notion of using the best response given the current choices of other players is an important solution concept in game theory, known as the Nash equilibrium (NE). In the coordination game defined by Table 17.2, the payoffs for TT are higher than for HH, but if a player has reasons to assume that the opponent will choose H, then selecting H is better than insisting on T.

Table 17.2: The payoff matrix of a coordination game.
Options H T
H ((1, 1)) ((0, 0))
T ((0, 0)) ((2, 2))
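The two local maxima of the coordination game can be found mechanically by checking, for every cell of the payoff matrix, whether either player could gain by deviating unilaterally. A Python sketch (the function name pure_nash is ours):

```python
from itertools import product

def pure_nash(game, options=("H", "T")):
    """Pure-strategy Nash equilibria of a two-player game, given as a
    dict mapping (choice_1, choice_2) -> (payoff_1, payoff_2)."""
    equilibria = []
    for c1, c2 in product(options, repeat=2):
        p1, p2 = game[(c1, c2)]
        # A cell is an equilibrium if neither player has a profitable deviation.
        best_1 = all(game[(a, c2)][0] <= p1 for a in options)
        best_2 = all(game[(c1, a)][1] <= p2 for a in options)
        if best_1 and best_2:
            equilibria.append((c1, c2))
    return equilibria

# The coordination game of Table 17.2:
COORD = {("H", "H"): (1, 1), ("H", "T"): (0, 0),
         ("T", "H"): (0, 0), ("T", "T"): (2, 2)}
print(pure_nash(COORD))  # → [('H', 'H'), ('T', 'T')]
```

Applying the same function to the matching pennies matrix of Table 17.1 returns an empty list: that game has no pure-strategy Nash equilibrium.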

See Nowé, Vrancx, & De Hauwere (2012) for the payoff matrices and descriptions of additional types of games. (We will address the prisoner’s dilemma, PD, and the battle of the sexes, BoS, game in Exercise 17.4.2.)

Games can be further characterized by the order of play (i.e., whether players make sequential vs. simultaneous moves) and by the challenges they pose to their players (e.g., competitive vs. coordination games). Most games assume some balanced and shared understanding of a game’s goals and rules (e.g., all players may either know their own options and payoffs, or even know the full payoff matrix). However, different games and different versions of the same game can differ substantially in the transparency of players’ actions, rewards, and strategies.

A key aspect in modeling games concerns each player’s knowledge of the goals, actions, and rewards of other players. From a modeler’s perspective, we must be aware which aspect of a potentially complex situation is being addressed by a model. This essentially boils down to asking: Which research question is being addressed by modeling this game?

Modeling a game

Given our experience in modeling dynamic agents and environments (from Chapter 16), we may not need any new tools to model strategic games. From the perspective of a player (e.g., an RL agent), the strategy and behavior of other players are part of the environment.

To do something more interesting, we will adopt a simulation created by Wataru Toyokawa (which can be found here). This will be a bit challenging, but it extends our previous models in three useful ways:

Rather than simulating the repeated encounters of two individuals, we will simulate a spatial population of agents in a lattice/torus design.

Rather than using our basic learning model from Section 16.2.1, we implement a more sophisticated learning model: An epsilon-greedy Q learning model, which uses a softmax criterion for choosing options.

Given that we are simulating a spatially arranged population of agents over time, we will visualize their choices as an animation.

Methodologically, the extension from individual agents to an entire population of learning agents also motivates some 3-dimensional data structures (i.e., arrays) that provide slots for each agent/player \(p\) and time step \(t\).

Basic setup

  1. Simulation parameters:
  • simulating a population of individuals, arranged in a square grid (a so-called lattice or torus)

Basic idea: Approximating the dynamics of a large (ideally infinite) system by simulating a spatial grid of its parts.

In each round, each individual plays against its 8 neighbors (defined as either sharing a side or a corner with the focal individual).

Boundary condition: Individuals on the edge(s) play against those on the edge of the opposite side (bottom vs. top, left vs. right).
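The wrap-around boundary condition can be implemented with modular arithmetic. A minimal Python sketch (the chapter's own code is in R; the function name is ours):

```python
def torus_neighbors(row, col, n=10):
    """The 8 neighbors (sharing a side or a corner) of cell (row, col)
    on an n x n lattice with wrap-around (torus) boundaries."""
    return [((row + dr) % n, (col + dc) % n)
            for dr in (-1, 0, 1) for dc in (-1, 0, 1)
            if (dr, dc) != (0, 0)]

# A corner cell's neighbors wrap around to the opposite edges:
print(torus_neighbors(0, 0))
```

For the cell in the top-left corner of a 10x10 lattice, the list includes (9, 9): the bottom-right corner is its diagonal neighbor on the torus.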

Figure 17.1 shows the resulting structure of 100 individuals. In each round, the focal individuals in the darker (blue/green/pink) color play their eight neighbors in lighter (blue/green/pink) color.

Figure 17.1: Illustrating a torus structure of 100 individuals arranged on a 10x10 lattice.

The lattice/torus design creates a spatial continuum of agents, strategies, and interactions. From the perspective of each individual agent, this setup multiplies the number of games she plays and the diversity of strategies she encounters. From the modeler’s perspective, this setup allows assessing the density of each strategy at each point in time, as well as the diffusion of strategies over time. Under some circumstances, a global pattern may emerge out of local interactions.

Note the 3-dimensional array of Q values (with dimensions of N_opt , N_t , and N_p , i.e., providing 2 x 100 x 100 slots).
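This array layout and the learning model of the simulation can be sketched as follows in Python (the chapter's own code is in R; the parameter values beta and alpha are illustrative assumptions, not the original settings):

```python
import numpy as np

N_opt, N_t, N_p = 2, 100, 100        # options x time steps x players = 2 x 100 x 100 slots
Q = np.zeros((N_opt, N_t, N_p))      # one Q value per option, trial, and player

def softmax_choice(q, beta=4.0, rng=np.random.default_rng(42)):
    """Pick an option with probability proportional to exp(beta * Q)."""
    p = np.exp(beta * (q - q.max()))  # subtract the max for numerical stability
    p = p / p.sum()
    return rng.choice(len(q), p=p)

def q_update(Q, k, t, p, reward, alpha=0.1):
    """Move option k's Q value for player p toward the received reward."""
    Q[k, t + 1, p] = Q[k, t, p] + alpha * (reward - Q[k, t, p])
```

On each trial, every player draws a choice from the softmax over her current Q values and then updates the chosen option's value from the rewards earned against her eight neighbors.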


Running the simulation as a for loop (for each time step (t) ):

  • Each player selects her choice once at the beginning of each trial. Thus, she does not consider her opponents individually.


Visualizing results for a population of agents requires calculating the density of their choices, rather than individual ones. This is why we computed the density_1 and density_2 vectors during the simulation:

Creating an animated plot

To visualize the spatial dynamics of agent choices over time, we can show which option each agent on the lattice/torus has chosen on every trial. This can be visualized as an animated gif, in two steps:

  • Prepare data: Transform the 3-dimensional array of all_choices into a 2-dimensional table all_choices_tb
  • Create an animated image (by using the gganimate and gifski packages):

The resulting Figure 17.2 shows that both options are equally prominent at first, but the superior Option 2 is becoming more popular from Trial 10 onwards. Due to the agents’ tendency to explore, Option 1 is still occasionally chosen.

Figure 17.2: The distribution of agent choices over time in the coordination game.


This exercise adjusts the simulation of the coordination game to the matching pennies (MP) game (defined by Table 17.1):

What is the optimal strategy that agents should learn in a purely competitive game?

What needs to be changed to use the above model to simulate this game?

Adapt the simulation to verify your answers to 1. and 2.


ad 1.: As any predictable agent could be exploited, the optimal choice for an individual agent is to randomly choose options. However, it is unclear what this would mean for a population of agents arranged on a lattice/torus.

ad 2.: Theoretically, we only need to re-define the payoff matrix of the game (shown in Table 17.1). However, as the model above yields an error (when converting negative payoffs into probabilities), we add 1 to all payoffs. This turns a zero-sum game into a competitive game with symmetrical payoffs, but has no effect on the differences in payoffs that matter for our RL agents:

ad 3.: Figure 17.3 shows the result of a corresponding simulation. In this particular simulation, a slight majority of agents chose Option 1, but the population did not reach a stable equilibrium. As each agent’s optimal strategy is to choose options randomly (i.e., with \(p = .5\) for each option), this could be evidence for learning. But to really demonstrate successful learning, further examinations would be necessary.

Figure 17.3: The distribution of agent choices over time in a competitive game.

17.2.2 Social learning

Learning can be modeled as a function of rewards and the behavior of others. When an option is better than another, it offers higher rewards and will be learned by an agent that explores and exploits an environment to maximize her utility. However, when other agents are present, a second indicator of an option’s quality is its popularity: Other things being equal, better options are more popular.

The basic idea of replicator dynamics (following Page, 2018, p. 308ff.) is simple: The probability of choosing an action is the product of its reward and its popularity.

Given a set of \(N_{opt}\) alternatives with corresponding rewards \(\pi(1), \ldots, \pi(N_{opt})\), the probability of choosing an option \(k\) at time step \(t+1\) is defined as:

\[ P_{t+1}(k) \;=\; P_t(k) \cdot \frac{\pi(k)}{\bar{\pi}_t}, \qquad \bar{\pi}_t \;=\; \sum_{i=1}^{N_{opt}} P_t(i)\,\pi(i). \]

Note that the fraction \(\frac{\pi(k)}{\bar{\pi}_t}\) divides the option’s current reward by the current average reward of all options. Its denominator \(\bar{\pi}_t\) is computed as the sum of all reward values weighted by their probability in the current population. As more popular options are weighted more heavily, this factor combines an effect of reward with an effect of popularity or conformity. Thus, the probability of choosing an option on the next decision cycle depends on its current probability, its current reward, and its current popularity.

Note that this particular conceptualization provides a model for an entire population, rather than any individual element of it. Additionally, the rewards received from each option are assumed to be fixed and independent of the choices, which may be pretty implausible for many real environments.
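To make the update rule concrete, here is a minimal Python sketch (the chapter's own implementation is in R). The reward values of 20, 10, and 5 units for Options A–C and the initial probabilities are assumptions inferred from Table 17.3 below, whose rows this sketch reproduces:

```python
rewards = {"A": 20.0, "B": 10.0, "C": 5.0}   # assumed rewards (inferred from Table 17.3)
prob    = {"A": 0.10, "B": 0.70, "C": 0.20}  # initial popularity of each option

history = []
for t in range(11):                          # time steps t = 0 .. 10
    avg_rew = sum(prob[k] * rewards[k] for k in prob)
    history.append((t, round(avg_rew, 2), {k: round(p, 3) for k, p in prob.items()}))
    # replicator update: scale each option's share by reward / average reward
    prob = {k: prob[k] * rewards[k] / avg_rew for k in prob}

for row in history:
    print(row)
```

Under these assumed rewards, the printed rows match Table 17.3: the population converges to choosing Option A with probability .993 and an average reward of 19.93 at t = 10.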

What changes would signal that the population is learning? A shift in the probability distribution of the actions chosen at each time step.

Implementation in R

Given that we collected all intermediate values in data , we can inspect our simulation results by printing the table:

Table 17.3: Data from replicator dynamics.
t avg_rew p_A p_B p_C
0 10.00 0.100 0.700 0.200
1 11.50 0.200 0.700 0.100
2 13.26 0.348 0.609 0.043
3 15.16 0.525 0.459 0.016
4 16.89 0.692 0.303 0.005
5 18.18 0.819 0.179 0.002
6 19.01 0.901 0.099 0.000
7 19.48 0.948 0.052 0.000
8 19.73 0.973 0.027 0.000
9 19.87 0.987 0.013 0.000
10 19.93 0.993 0.007 0.000

As shown in Table 17.3, we initialized our loop at a value of t = 0 . This allowed us to include the original situation (prior to any updating of prob ) as the first line (i.e., in row data[(t + 1), ] ).

Inspecting the rows of Table 17.3 makes it clear that the probabilities of suboptimal options (here: Options B and C) are decreasing, while the probability of choosing the best option (A) is increasing. Thus, options with higher rewards are becoming more popular — and the population quickly converges on choosing the best option.

The population’s systematic shift from poorer to richer options also implies that the value of the average reward (here: avg_rew ) is monotonically increasing and approaching the value of the best option. Thus, the function of avg_rew is similar to the role of an aspiration level \(A\) in reinforcement learning models (see Section 16.2.1), with the difference that avg_rew reflects the average aspiration of the entire population, whereas \(A_i\) denotes the aspiration level of an individual agent \(i\).

Visualizing results

The shift in collective dynamics away from poor options and towards the best option can be depicted by the following visualizations of data . Figure 17.4 shows the trends in choosing each option as a function of the time steps 0 to 10:

Figure 17.4: Trends in the probability of choosing each option per time step.

Note that we re-formatted data into long format prior to plotting (to get the options as one variable, rather than as three separate variables) and changed the y-axis to a percentage scale.

Given that the probability distribution at each time step must sum to 1, it makes sense to display them as a stacked bar chart, with different colors for each option (see Figure 17.5):

Figure 17.5: The probability of choosing options per time step.

This shows that the best option (here: Option A) is unpopular at first, but becomes the dominant one by about the 4th time step. The learning process shown here appears to be much quicker than that of an individual reinforcement learner (in Section 16.2.1). This is mostly due to a change in our level of analysis: Our model of replicator dynamics describes an entire population of agents. In fact, our use of the probability distribution as a proxy for the popularity of options implicitly assumes that an infinite population of agents is experiencing the environment and both exploring and exploiting the full set of all options on every time step. Whereas an individual RL agent must first explore options and — if it gets unlucky — waste a lot of valuable time on inferior options, a population of agents can evaluate the entire option range and rapidly converge on the best option.

Convergence on the best option is guaranteed as long as all options are chosen initially (i.e., have an initial probability of \(P_0(i) > 0\)) and the population of agents is large (ideally infinite).


Answer the following questions by studying the basic equation of replicator dynamics:

What’s the impact of options that offer zero rewards (i.e., \(\pi(k) = 0\))?

What’s the impact of options that are not chosen (i.e., \(P_t(k) = 0\))?

What happens if the three options (A–C) yield identical rewards (e.g., 10 units for each option)?

What happens if the rewards of the three options (A–C) are 10, 20, and 70 units, but their initial popularities are reversed (i.e., 70%, 20%, and 10%, respectively)?

Do 10 time steps still suffice to learn the best option?

What if the initial contrast is even more extreme, with rewards of 1, 2, and 97 units, and initial probabilities of 97%, 2%, and 1%, respectively?

Hint: Run these simulations in your mind first, then check your predictions with the code above.

What real-world environment could qualify as

one in which the rewards of all objects are stable and independent of the agents’ actions?

one in which all agents also have identical preferences?

How would the learning process change

if the number of agents was discrete (e.g., 10)?

if the number of options exceeded the number of agents?

Hint: Consider the range of values that prob would and could take in both cases.

Value of Projective Tests:

Though projective tests were developed for understanding human behavior and emotions, not everyone agrees with their outcomes. Despite their many limitations, projective tests are still used by many psychiatrists and psychologists.

Also, many experts who work with these projective tests are updating them so that they not only provide validity but also add some value.

Projective tests have also been used in market research to evaluate the emotions, associations, and thought processes related to brands and products.

2021 Pennzoil 400 odds, predictions: Surprising NASCAR at Las Vegas picks from advanced model

Millions flock to Las Vegas every year with hopes of hitting the jackpot and returning as an instant millionaire. On Sunday, 40 drivers will line up with hopes their number lands in victory lane in the 2021 Pennzoil 400. The green flag drops at 3:30 p.m. ET from the 1.5-mile Las Vegas Motor Speedway, with defending champion Joey Logano starting from the No. 15 spot. Kevin Harvick, who has won twice at Las Vegas and will start on the pole Sunday, is the 9-2 favorite in the 2021 Pennzoil 400 odds from William Hill Sportsbook.

Denny Hamlin is 6-1, while Logano and Chase Elliott are both 15-2, Brad Keselowski is 8-1, and Kyle Larson is 17-2 on the 2021 NASCAR at Las Vegas odds board. Before you scour the 2021 Pennzoil 400 starting lineup and make your NASCAR at Las Vegas predictions for Sunday, be sure to see the latest 2021 Pennzoil 400 picks from SportsLine's proven projection model.

Developed by daily Fantasy pro and SportsLine predictive data engineer Mike McClure, this proprietary NASCAR prediction computer model simulates every race 10,000 times, taking into account factors such as track history and recent results.

The model began the 2020 season paying out big by picking Denny Hamlin to win his second consecutive Daytona 500 at 10-1. The model also called Kevin Harvick's win at Atlanta and nailed a whopping nine top-10 finishers in that race. McClure then used the model to lock in a 10-1 bet on Hamlin for his win at Miami.

At The Brickyard, the model called Harvick's fourth victory of the season. Then during the 2020 NASCAR Playoffs, the model nailed its picks in back-to-back races, calling Denny Hamlin to win at 17-2 at Talladega and Chase Elliott to win at 7-2 at the Charlotte Roval. Anybody who has followed its NASCAR picks has seen huge returns.

Top 2021 Pennzoil 400 predictions

The model is high on Ryan Blaney, even though he's a 12-1 long shot in the latest NASCAR at Las Vegas odds 2021. He's a target for anyone looking for a huge payday. Blaney has been tantalizingly close to Victory Lane at Las Vegas, finishing second in the 2015 Xfinity Series race and as high as fifth - three times - in his NASCAR Cup Series cars.

The Team Penske driver is off to a rough start in his 2021 campaign, finishing 30th at the Daytona 500, 15th at the O'Reilly Auto Parts 253, and 29th last week at the Dixie Vodka 400. However, Blaney was at the front in last spring's Pennzoil 400 and last fall's South Point 400, and has plenty of experience behind the wheel of his No. 12 Ford.

Blaney will begin from the No. 26 position in the 2021 Pennzoil 400 starting lineup, but McClure sees him getting to the front quickly and loves his value as part of your 2021 Pennzoil 400 bets.

And a massive shocker: Kevin Harvick, the top Vegas favorite at 9-2, stumbles big-time and barely cracks the top 10. There are far better values in this loaded 2021 Pennzoil 400 starting grid. Harvick is a two-time winner at Las Vegas and is coming off a career-best nine victories in 2020, but he missed out on his chance to race for a second championship thanks to a disappointing stretch run.

That included a 10th place finish at the 2020 South Point 400 in Las Vegas after starting on the pole and a 16th-place finish in the last 1.5-mile oval race of the season in Texas. Harvick is now on an eight-race winless streak, and 9-2 is too steep a price to pay.

How to make 2021 NASCAR at Las Vegas picks

The model is also targeting two other drivers with NASCAR at Las Vegas odds 2021 of 10-1 or longer to make a serious run at winning it all. Anyone who backs these drivers could hit it big. You can see all the NASCAR picks over at SportsLine.

So who wins the 2021 Pennzoil 400? And which long shots stun NASCAR? Check out the latest 2021 Pennzoil 400 odds below, then visit SportsLine now to see the full projected 2021 Pennzoil 400 leaderboard, all from the model that nailed Hamlin's win at the 2020 Daytona 500.

2021 Pennzoil 400 odds

Kevin Harvick 9-2
Martin Truex Jr. 6-1
Denny Hamlin 6-1
Chase Elliott 15-2
Joey Logano 15-2
Brad Keselowski 8-1
Kyle Larson 17-2
Kyle Busch 10-1
Ryan Blaney 12-1
William Byron 18-1
Alex Bowman 20-1
Kurt Busch 25-1
Christopher Bell 28-1
Austin Dillon 40-1
Aric Almirola 40-1
Tyler Reddick 60-1
Cole Custer 75-1
Matt DiBenedetto 75-1
Chris Buescher 100-1
Ryan Newman 100-1
Bubba Wallace 100-1
Chase Briscoe 100-1
Ross Chastain 125-1
Ricky Stenhouse Jr. 125-1
Michael McDowell 150-1
Erik Jones 150-1
Daniel Suarez 250-1
Ryan Preece 1000-1
Justin Haley 2500-1
Anthony Alfredo 2500-1
Corey Lajoie 2500-1
Timmy Hill 5000-1
Cody Ware 5000-1
Garrett Smithley 5000-1
Josh Bilicki 5000-1
Joey Gase 5000-1
BJ McLeod 5000-1
Quin Houff 5000-1


Lifting property Edit

The usual category theoretical definition is in terms of the property of lifting that carries over from free to projective modules: a module P is projective if and only if for every surjective module homomorphism f : N → M and every module homomorphism g : P → M , there exists a module homomorphism h : P → N such that f h = g . (We don't require the lifting homomorphism h to be unique; this is not a universal property.)

The advantage of this definition of "projective" is that it can be carried out in categories more general than module categories: we don't need a notion of "free object". It can also be dualized, leading to injective modules. The lifting property may also be rephrased as every morphism from P to M factors through every epimorphism to M . Thus, by definition, projective modules are precisely the projective objects in the category of R-modules.

Split-exact sequences Edit

A module P is projective if and only if every short exact sequence of modules of the form

0 → A → B → P → 0

is a split exact sequence. That is, for every surjective module homomorphism f : B → P there exists a section map, that is, a module homomorphism h : P → B such that f h = idP. In that case, h(P) is a direct summand of B, h is an isomorphism from P to h(P) , and h f is a projection on the summand h(P) .

Direct summands of free modules Edit

A module P is projective if and only if there is another module Q such that the direct sum of P and Q is a free module.
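A standard illustration (our example, not from the article): over \(R = \mathbb{Z}/6\mathbb{Z}\), the Chinese remainder theorem gives a direct sum decomposition of \(R\) as a module over itself,

```latex
\mathbb{Z}/6\mathbb{Z} \;\cong\; \mathbb{Z}/2\mathbb{Z} \,\oplus\, \mathbb{Z}/3\mathbb{Z},
```

so \(\mathbb{Z}/2\mathbb{Z}\) is a direct summand of the free module \(R\) and hence projective; it is not free, because every nonzero free \(R\)-module has a multiple of 6 elements.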

Exactness Edit

An R-module P is projective if and only if the covariant functor Hom(P, -): R-ModAb is an exact functor, where R-Mod is the category of left R-modules and Ab is the category of abelian groups. When the ring R is commutative, Ab is advantageously replaced by R-Mod in the preceding characterization. This functor is always left exact, but, when P is projective, it is also right exact. This means that P is projective if and only if this functor preserves epimorphisms (surjective homomorphisms), or if it preserves finite colimits.

Dual basis Edit

The following properties of projective modules are quickly deduced from any of the above (equivalent) definitions of projective modules:

  • Direct sums and direct summands of projective modules are projective.
  • If e = e² is an idempotent in the ring R, then Re is a projective left module over R.

The relation of projective modules to free and flat modules is subsumed in the following diagram of module properties:

The left-to-right implications are true over any ring, although some authors define torsion-free modules only over a domain. The right-to-left implications are true over the rings labeling them. There may be other rings over which they are true. For example, the implication labeled "local ring or PID" is also true for polynomial rings over a field: this is the Quillen–Suslin theorem.

Projective vs. free modules Edit

Any free module is projective. The converse is true in the following cases:

  • if R is a field or skew field: any module is free in this case.
  • if the ring R is a principal ideal domain. For example, this applies to R = Z (the integers), so an abelian group is projective if and only if it is a free abelian group. The reason is that any submodule of a free module over a principal ideal domain is free.
  • if the ring R is a local ring. This fact is the basis of the intuition of "locally free = projective". This fact is easy to prove for finitely generated projective modules. In general, it is due to Kaplansky (1958); see Kaplansky's theorem on projective modules.

In general though, projective modules need not be free:

  • Over a direct product of rings R × S where R and S are nonzero rings, both R × 0 and 0 × S are non-free projective modules.
  • Over a Dedekind domain a non-principal ideal is always a projective module that is not a free module.
  • Over a matrix ring Mn(R), the natural module Rn is projective but not free. More generally, over any semisimple ring, every module is projective, but the zero ideal and the ring itself are the only free ideals.

The difference between free and projective modules is, in a sense, measured by the algebraic K-theory group K0(R), see below.

Projective vs. flat modules Edit

Every projective module is flat. [1] The converse is in general not true: the abelian group Q is a Z-module which is flat, but not projective. [2]

Conversely, a finitely related flat module is projective. [3]

Govorov (1965) and Lazard (1969) proved that a module M is flat if and only if it is a direct limit of finitely-generated free modules.

In general, the precise relation between flatness and projectivity was established by Raynaud & Gruson (1971) (see also Drinfeld (2006) and Braunling, Groechenig & Wolfson (2016)) who showed that a module M is projective if and only if it satisfies the following conditions:

  • M is flat,
  • M is a direct sum of countably generated modules,
  • M satisfies a certain Mittag-Leffler type condition.

Submodules of projective modules need not be projective; a ring R for which every submodule of a projective left module is projective is called left hereditary.

Quotients of projective modules also need not be projective, for example Z/n is a quotient of Z, but not torsion free, hence not flat, and therefore not projective.

The category of finitely generated projective modules over a ring is an exact category. (See also algebraic K-theory).

Given a module, M, a projective resolution of M is an infinite exact sequence of modules

⋯ → Pn → Pn−1 → ⋯ → P1 → P0 → M → 0,

with all the Pi projective. Every module possesses a projective resolution. In fact a free resolution (resolution by free modules) exists. The exact sequence of projective modules may sometimes be abbreviated to P(M) → M → 0 or P• → M → 0 . A classic example of a projective resolution is given by the Koszul complex of a regular sequence, which is a free resolution of the ideal generated by the sequence.

The length of a finite resolution is the subscript n such that Pn is nonzero and Pi = 0 for i greater than n. If M admits a finite projective resolution, the minimal length among all finite projective resolutions of M is called its projective dimension and denoted pd(M). If M does not admit a finite projective resolution, then by convention the projective dimension is said to be infinite. As an example, consider a module M such that pd(M) = 0 . In this situation, the exactness of the sequence 0 → P0 → M → 0 indicates that the arrow in the center is an isomorphism, and hence M itself is projective.

Projective modules over commutative rings have nice properties.

The localization of a projective module is a projective module over the localized ring. A projective module over a local ring is free. Thus a projective module is locally free (in the sense that its localization at every prime ideal is free over the corresponding localization of the ring).

The converse is true for finitely generated modules over Noetherian rings: a finitely generated module over a commutative noetherian ring is locally free if and only if it is projective.

However, there are examples of finitely generated modules over a non-Noetherian ring which are locally free and not projective. For instance, a Boolean ring has all of its localizations isomorphic to F2, the field of two elements, so any module over a Boolean ring is locally free, but there are some non-projective modules over Boolean rings. One example is R/I where R is a direct product of countably many copies of F2 and I is the direct sum of countably many copies of F2 inside of R. The R-module R/I is locally free since R is Boolean (and it is finitely generated as an R-module too, with a spanning set of size 1), but R/I is not projective because I is not a principal ideal. (If a quotient module R/I, for any commutative ring R and ideal I, is a projective R-module then I is principal.)

However, it is true that for finitely presented modules M over a commutative ring R (in particular if M is a finitely generated R-module and R is noetherian), the following are equivalent. [4]

Moreover, if R is a noetherian integral domain, then, by Nakayama's lemma, these conditions are equivalent to

  • The dimension of the k(p)-vector space M ⊗R k(p) is the same for all prime ideals p of R, where k(p) is the residue field at p. [5] That is to say, M has constant rank (as defined below).

Let A be a commutative ring. If B is a (possibly non-commutative) A-algebra that is a finitely generated projective A-module containing A as a subring, then A is a direct factor of B. [6]

Rank Edit

A basic motivation of the theory is that projective modules (at least over certain commutative rings) are analogues of vector bundles. This can be made precise for the ring of continuous real-valued functions on a compact Hausdorff space, as well as for the ring of smooth functions on a smooth manifold (see Serre–Swan theorem that says a finitely generated projective module over the space of smooth functions on a compact manifold is the space of smooth sections of a smooth vector bundle).

Vector bundles are locally free. If there is some notion of "localization" which can be carried over to modules, such as the usual localization of a ring, one can define locally free modules, and the projective modules then typically coincide with the locally free modules.

The Quillen–Suslin theorem, which solves Serre's problem, is another deep result: if K is a field, or more generally a principal ideal domain, and R = K[X1, …, Xn] is a polynomial ring over K, then every projective module over R is free. This problem was first raised by Serre with K a field (and the modules being finitely generated). Bass settled it for non-finitely generated modules, and Quillen and Suslin independently and simultaneously treated the case of finitely generated modules.

Since every projective module over a principal ideal domain is free, one might ask this question: if R is a commutative ring such that every (finitely generated) projective R-module is free, then is every (finitely generated) projective R[X]-module free? The answer is no. A counterexample occurs with R equal to the local ring of the curve y² = x³ at the origin. Thus the Quillen–Suslin theorem could never be proved by a simple induction on the number of variables.

2. Third-wave extended mind and predictive processing

Thanks to John Schwenkler for the invitation to guest-blog this week about our book Extended Consciousness and Predictive Processing: A Third-Wave View (Routledge, 2019):

Where does your (conscious) mind stop, and the rest of the world begin? We defend what has come to be called a “third-wave” account of the extended mind. In this post we will first aim to give you a sense of what the debate is about within the extended mind community. Second, we will sketch our third-wave interpretation of the predictive processing theory of the mind. We don’t consider how the third-wave perspective on the extended mind might be criticised, an issue we take up in later posts.

The “wave” terminology is due to Sutton (2010), and is used to distinguish the following three lines of argument for the extended mind.

First-wave extended mind is good old fashioned role functionalism, associated either with common-sense functionalism (Clark & Chalmers 1998) or with psychofunctionalism (Wheeler 2010). First-wave theorists (as we saw in our first post) argued for the extension of minds into the world on the basis of the functional equivalence of elements located internally and externally to the individual. If those elements make similar causal contributions in the guidance of a person’s behaviour, they should be treated equally. More specifically, we shouldn’t exclude the external element from being a part of a person’s mind simply on the basis of their location outside of the biological body.

Second-wave arguments turn on the different but complementary functional contributions of tools and technologies as compared with the biological brain. Thus notational systems for doing mathematics, for instance, complement the brain’s inner modes of processing, resulting in the transformation of the mathematical reasoning capacities of individuals, groups and lineages. Second-wave arguments may seem to suggest a picture in which internal neural processes have their own proprietary functional properties that get combined with public systems of representation such as systems of mathematical notation that likewise have their own fixed functional properties. Something genuinely new then emerges (e.g. capacities for mathematical reasoning) when these elements with their own self-standing functional properties get combined or functionally integrated.

Third-wave arguments are in agreement with the second-wave in taking material culture to be transformatory of what humans can do as thinkers. However, the third-wave view takes this transformatory process to be reciprocal and ongoing. Human minds continually form over time through the meshing together of embodied actions, material tools and technologies, and cultural norms for the usage of these tools and technologies. Individual agents are “dissolved into peculiar loci of coordination and coalescence among multiple structured media” (Sutton 2010, p.213). Control and coordination is distributed over and propagated through the media taken up in cultural patterns of activity. The constraints (the local rules) that govern the interactions between the components (internal and external) of extended cognitive systems need not all arise from within the biological organism. Some of the constraints may originate in social and cultural practices, in “the things people do in interaction with one another” (Hutchins 2011, p. 4). The boundaries separating the individual from its environment and from the collectives in which the individual participates are “hard won and fragile developmental and cultural achievements” (Sutton 2010, p. 213).

A commitment to extended consciousness, unlike in the previous waves of theorising about the extended mind, falls naturally out of third-wave arguments for the extended mind. We follow Susan Hurley in thinking of the material realisers of consciousness as extended dynamic singularities (Hurley 1998, 2010). Hurley uses this term to refer to a singularity in "the field of causal flows characterised through time by a tangle of multiple feedback loops of varying orbits". Such causal flows form out of the organism's looping cycles of perception and action, where the complex tangle of feedback loops is closed by the world. The extended dynamic singularity, she says, "is centred on the organism and moves around with it, but it does not have sharp boundaries" (Hurley 1998, p. 2). We depart from Hurley, however, in allowing for a decentering of extended dynamic singularities. As the cognitive anthropologist Ed Hutchins notes, some systems "have a clear centre, while others have multiple centres, or no centre at all" (Hutchins 2011, p. 5). The propagation of activity across various media is coordinated by a 'lightly equipped human' working (sometimes) in groups, and always embedded in cultural practices.

Third-wave arguments thus highlight the need to rethink the metaphysics within which arguments for extended minds are developed. Unlike the standard notions of constitution, realisation and composition, all of which are atemporal or synchronic determination relations, we propose to understand such metaphysical determination relations in temporal or diachronic terms. Cognitive processes are "creatures of time" (Noë 2006), i.e., they depend for their existence on temporal unfolding over various media: some neural or bodily, others involving other people and the resources provided by an environment shaped by our cultural activities and patterns of practice. Extended minds are constituted diachronically. We summarise four key tenets of the third-wave view as follows:

Key Tenets of Third-Wave Extended Mind

1. Extended Dynamic Singularities: some cognitive processes are constituted by causal networks with internal and external orbits comprising a singular cognitive system.
2. Flexible and Open-Ended Boundaries: the boundaries of mind are not fixed and stable but fragile and hard-won, and always open to renegotiation.
3. Distributed Cognitive Assembly: the task- and context-sensitive assembly of cognitive systems is driven not by the individual agent but by a nexus of constraints, some neural, some bodily, and some environmental (cultural, social, material).
4. Diachronic Constitution: cognition is intrinsically temporal and dynamical, unfolding over different but interacting temporal scales of behaviour.

In our book we show how to interpret the predictive processing theory through the lens of the third-wave extended mind. Predictive processing casts agents as generative models of their environment. A generative model is a probabilistic structure that generates predictions about the causes of sensory stimuli. We argue that the ongoing tuning and maintenance of the generative model by active inference entails the dynamic entanglement of agent and environment. In the following three posts we will put this third-wave perspective on predictive processing to work to argue for three theses defended in our book:

Post 3: The Markov blanketed mind: There is no single, fixed and permanent boundary separating the inner conscious mind from the outside world. The boundary separating conscious beings from the outside world is fluid and actively constructed through embodied activity.

Post 4: Seeing what you expect to see: The predictive processing theory sometimes claims that perceptual experience should be thought of as controlled hallucination. The contribution of the world is to provide a check on the brain’s predictive processes. We argue by contrast that predictive processing that generates conscious experience cannot be unplugged from the world but is exploratory, active and world-involving.

Post 5: The diachronic constitution of extended consciousness: Adopting our third-wave perspective on predictive processing entails a new metaphysics of the constitution of conscious experience as diachronic, not synchronic.

Clark, A., and Chalmers, D. (1998). The extended mind. Analysis, 58(1), 7-19.

Hurley, S.L. (1998). Consciousness in Action. Cambridge, MA: Harvard University Press.

Hutchins, E. (2011). Enculturating the supersized mind. Philosophical Studies, 152, 437-446.

Noë, A. (2006). Experience of the world in time. Analysis, 66(1), 26-32.

Sutton, J. (2010). Exograms and interdisciplinarity: History, the extended mind, and the civilizing process. In R. Menary (Ed.), The Extended Mind (pp. 189-225). Cambridge, MA: The MIT Press.

Wheeler, M. (2010). In defense of extended functionalism. In R. Menary (ed.), The Extended Mind (pp. 245-270). Cambridge, MA: The MIT Press.



We assume that when an individual chooses a destination, as in the radiation model [32] and the OPS model [42], she/he first evaluates the benefit of each location's opportunities [43], where the benefit is randomly drawn from a distribution p(z). The individual then comprehensively compares the benefits of the origin, the destination and the intervening opportunities, and selects a location as the destination. To characterize this comprehensive comparison, we use two parameters, α and β. Parameter α reflects the individual's tendency to choose a destination whose benefit is higher than the benefits of both the origin and the intervening opportunities. Parameter β reflects the individual's tendency to choose a destination whose benefit is higher than the benefit of the origin, when the benefit of the origin is in turn higher than the benefits of the intervening opportunities. According to the above assumptions, the probability that location j is selected by the individual at location i is

$$Q_{ij}=\int_{0}^{+\infty}\Pr_{m_i+\alpha s_{ij}}(z)\,\Pr_{\beta s_{ij}}(<z)\,\Pr_{m_j}(>z)\,dz, \tag{1}$$

where \(m_i\) is the number of opportunities at location i, \(m_j\) is the number of opportunities at location j, \(s_{ij}\) is the number of intervening opportunities [30] (i.e., the sum of the number of opportunities at all locations whose distances from i are shorter than the distance from i to j), \(\Pr_{m_i+\alpha s_{ij}}(z)\) is the probability that the maximum benefit obtained after \(m_i+\alpha s_{ij}\) samplings is exactly z, \(\Pr_{\beta s_{ij}}(<z)\) is the probability that the maximum benefit obtained after \(\beta s_{ij}\) samplings is less than z, and \(\Pr_{m_j}(>z)\) is the probability that the maximum benefit obtained after \(m_j\) samplings is greater than z; α and β are both non-negative and α + β ≤ 1.

Since \(\Pr_x(<z)=p(<z)^x\), we obtain

$$Q_{ij}=\int_{0}^{+\infty}(m_i+\alpha s_{ij})\,p(<z)^{\,m_i+\alpha s_{ij}+\beta s_{ij}-1}\,\bigl(1-p(<z)^{m_j}\bigr)\,p(z)\,dz. \tag{2}$$

Evaluating the integral, Equation (1) can be written as

$$Q_{ij}=\frac{(m_i+\alpha s_{ij})\,m_j}{\bigl(m_i+(\alpha+\beta)s_{ij}\bigr)\bigl(m_i+(\alpha+\beta)s_{ij}+m_j\bigr)}. \tag{3}$$

Then, the probability of the individual at location i choosing location j is

$$P_{ij}=\frac{Q_{ij}}{\sum_{k\neq i}Q_{ik}}. \tag{4}$$

Further, if we know the total number of individuals \(O_i\) who travel from location i, the flux \(T_{ij}\) from location i to location j can be calculated as

$$T_{ij}=O_i\,P_{ij}. \tag{5}$$

This is the final form of the model and we name it the universal opportunity (UO) model.

The α and β parameters in the UO model reflect two behavioral tendencies of the individual when choosing potential destinations (i.e., destinations whose opportunity benefit is higher than the benefit of the origin). From Eq. (3), we can see that the larger the value of parameter α, the greater the probability that distant potential destinations will be selected by the individual. We name this behavioral tendency the exploratory tendency. On the other hand, the larger the value of parameter β, the greater the probability that near potential destinations will be selected. We name this behavioral tendency the cautious tendency. We choose average travel distance and normalized entropy as two fundamental metrics to discuss the influence of the parameters α and β on individual destination selection behavior. The average travel distance reflects the bulk density of individual destination selection [44-47], and the normalized entropy reflects the heterogeneity of individual destination selection [48]. As shown in Fig. 1, the two metrics follow the same regularities as the parameters change, whether the number of destination opportunities is uniformly or randomly distributed. When α = 0 and β = 1, the average travel distance is the shortest and the normalized entropy is the smallest; when α = 0 and β = 0, the average travel distance is the longest and the normalized entropy is the largest. The definitions of the two parameters readily explain these regularities. The closer α is to 0 and β is to 1, the more cautious the individual and the higher the probability of choosing near potential destinations, so the average travel distance is shorter and the heterogeneity is stronger. The closer α is to 1 and β is to 0, the more exploratory the individual and the higher the probability of choosing distant potential destinations, so the average travel distance increases while the heterogeneity decreases. When α and β are both close to 0, the individual attaches more importance to the benefit that a location brings to him/her and does not care about the order of locations, so the average travel distance is longer and the homogeneity is stronger.
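The two metrics can be computed from a row of destination-choice probabilities as sketched below. This is an illustrative Python sketch, not the authors' code; in particular, normalizing the entropy by the entropy of a uniform choice over the positive-probability candidates is our assumption about the definition cited as ref. [48].

```python
import numpy as np

def average_travel_distance(P, d):
    """Expected travel distance: sum over destinations j of P[j] * d[j]."""
    return float(np.dot(P, d))

def normalized_entropy(P):
    """Shannon entropy of the choice distribution, divided by the entropy
    of a uniform choice over the positive-probability candidates
    (our assumed normalization), so the result lies in [0, 1]."""
    P = np.asarray(P, dtype=float)
    P = P[P > 0]
    if len(P) <= 1:
        return 0.0  # a deterministic choice has zero entropy
    return float(-(P * np.log(P)).sum() / np.log(len(P)))
```

A uniform choice over all candidates gives a normalized entropy of 1 (maximal homogeneity), while concentrating all probability on one destination gives 0.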

Average travel distance and normalized entropy versus different parameter combinations. (a,b) Average travel distance and normalized entropy values corresponding to different parameter combinations. Here, the number of destination opportunities is a uniform distribution. (c,d) Same average travel distance and normalized entropy values as in (a,b), but the number of destination opportunities is a random distribution.

Moreover, when α and β take extreme values (i.e., at the three vertices of the triangle in Fig. 1), we can derive three special human mobility models. When α = 0, β = 0, we name this model the opportunity only (OO) model (see details in Supplementary Information, The derivation of the OO model). In this model, the individual chooses the location whose benefit is higher than the benefit of the origin. Then, the probability of the individual at location i choosing location j as the destination is

$$P_{ij}=\frac{m_j/(m_i+m_j)}{\sum_{k\neq i} m_k/(m_i+m_k)}. \tag{6}$$

When α = 1, β = 0, our model can be simplified to the OPS model, in which the individual chooses the location whose benefit is higher than the benefit of the origin and the benefits of the intervening opportunities (see details in Supplementary Information, The derivation of the OPS model). Then, the probability of the individual at location i choosing location j as the destination is

$$P_{ij}=\frac{m_j/(m_i+s_{ij}+m_j)}{\sum_{k\neq i} m_k/(m_i+s_{ik}+m_k)}. \tag{7}$$

When α = 0, β = 1, our model can be simplified to the radiation model, in which the individual chooses the location whose benefit is higher than the benefit of the origin while the benefits of the intervening opportunities are lower than the benefit of the origin (see details in Supplementary Information, The derivation of the radiation model). Then, the probability of the individual at location i choosing location j as the destination is

$$P_{ij}=\frac{m_i m_j/\bigl((m_i+s_{ij})(m_i+s_{ij}+m_j)\bigr)}{\sum_{k\neq i} m_i m_k/\bigl((m_i+s_{ik})(m_i+s_{ik}+m_k)\bigr)}. \tag{8}$$

From Eqs. (6)–(8), we can see that the OO model, the OPS model and the radiation model are all special cases of our UO model.
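As an illustrative sketch (not the authors' code; the function and variable names are our own), the UO destination-choice probabilities can be computed from opportunity counts and a distance matrix as follows:

```python
import numpy as np

def uo_probability(m, d, i, alpha, beta):
    """Destination-choice probabilities of the UO model for origin i.

    m     : opportunity counts m_k for every location k
    d     : distance matrix; d[i][j] is the distance from i to j
    alpha : exploratory tendency, beta : cautious tendency (alpha + beta <= 1)
    """
    n = len(m)
    Q = np.zeros(n)
    for j in range(n):
        if j == i:
            continue
        # s_ij: total opportunities at locations closer to i than j is
        s_ij = sum(m[k] for k in range(n)
                   if k not in (i, j) and d[i][k] < d[i][j])
        a = m[i] + alpha * s_ij             # effective origin-side samplings
        g = a + beta * s_ij                 # equals m_i + (alpha + beta) * s_ij
        Q[j] = a * m[j] / (g * (g + m[j]))  # closed-form weight, Eq. (3)
    return Q / Q.sum()                      # normalization, Eq. (4)
```

Setting (α, β) to (0, 0), (1, 0) and (0, 1) reproduces the OO, OPS and radiation models respectively, matching Eqs. (6)-(8).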


We use fourteen empirical data sets, including commuting trips between United States counties (USC), commuting trips between the provinces of Italy (ITC), commuting trips between the subregions of Hungary (HUC), freight between Chinese cities (CNF), internal job hunting in China (CNJ), internal migrations in the US (USM), intercity travels in China (CNT), intercity travels in the US (UST), intercity travels in Belgium (BLT), intracity trips in Suzhou (SZT), intracity trips in Beijing (BJT), intracity trips in Shenzhen (SHT), intracity trips in London (LOT) and intracity trips in Berlin (BET) (see Methods), to validate the predictive ability of the UO model. We first extract the flux Tij from location i to location j from each data set to obtain the real mobility matrix. Then, we exploit the Sørensen similarity index [38] (SSI, see Methods) to calculate the similarity between the real mobility matrix and the mobility matrix predicted by the UO model under different parameter combinations. The results are shown in Fig. 2. Figure 2o shows the optimal values of the parameters α and β corresponding to the highest SSI for the fourteen data sets.

Results for empirical data sets. (a–n) We exploit the SSI to calculate the similarity between the real mobility matrix and the predicted mobility matrix under different parameter combinations for the fourteen data sets. Here, the color bar represents the SSI, where a dark red (blue) dot indicates a higher (lower) SSI. (o) The optimal values of the parameters α and β corresponding to the highest SSI for the fourteen data sets.

It can be seen from Fig. 2a–d that for USC, ITC, HUC and CNF, when α is close to 0 and β is close to 1, the SSI is relatively large. The reason is that for commuting data sets (USC, ITC and HUC), the commuting distance or time is very important for commuters. As a result, most people tend to choose near potential destinations when finding a job based on their place of residence or adjusting their place of residence after finding a job. This cautious destination selection tendency also exists in freight. Freight to far destinations will lead to an increase in transportation costs and a decrease in the freight frequency, which will have a negative impact on freight revenue. Thus, unless the destination opportunity benefit is very high, the individual tends to choose a near destination rather than a far destination for freight. For the migration and job hunting data sets (USM and CNJ), when α is close to 1 and β is close to 0, the SSI is relatively large, as shown in Fig. 2e,f. The reason is that both job seekers and migrants pay more attention to the destination opportunity benefit rather than the distance to the destination. In other words, they are more exploratory but less cautious. Even if a high benefit destination is far away, it will still be selected by individuals with a relatively high probability. The reason is that the distance to the destination has a smaller impact on long temporal scale mobility behaviors, such as migration and job hunting, than on daily commuting behaviors. For intercity travel data sets (CNT, UST and BLT), when α and β are both near the middle of the diagonal line of the triangle, the SSI is relatively large, as shown in Fig. 2g–i. For most people, intercity travel is occasional and not as frequent as commuting. Travelers are less inclined than commuters to choose near potential destinations but they tend to explore distant potential destinations. 
Thus, the exploratory tendency parameter α of intercity travel is much larger than that of commuting. On the other hand, the travel cost matters more for intercity travel than for migration; thus, the cautious tendency parameter β of intercity travel is larger than that of migration. For the intracity trip data sets (SZT, BJT, SHT, LOT and BET), when α and β are both close to 0, the SSI is relatively large, as shown in Fig. 2j–n. The reason is that, compared with intercity mobility behavior on a large spatial scale, the spatial scale of intracity mobility behavior is small. In this scenario, the individual is not particularly concerned about the travel distance and focuses more on the benefit that the location will directly bring to him/her. Thus, the optimal values of α and β are both close to 0, as shown in Fig. 2o.

We next compare the predictive accuracy of the mobility fluxes of the UO model with those of the radiation model, the OPS model and the OO model. In terms of SSI, as shown in Fig. 3 and Table 1, the UO model performs best, whereas the radiation model and the OPS model provide relatively accurate predictions only for some data sets. For example, the radiation model can predict commuting and freight trips relatively accurately but cannot accurately predict other types of mobility. The reason is that the individual tends to choose near potential destinations rather than distant potential destinations in commuting and freight, where travel costs are more important. From Fig. 2o, we can see that for the commuting and freight data sets, the optimal parameter β (which reflects the individual's cautious tendency) of the UO model is close to 1, and the optimal parameter α (which reflects the individual's exploratory tendency) is close to 0. Therefore, the prediction accuracy of the radiation model, in which the individual only chooses the closest potential destination (i.e., α = 0, β = 1), is close to that of the UO model on the commuting and freight data sets. However, the prediction accuracy of the radiation model is considerably lower than that of the UO model on the job hunting, migration and noncommuting travel data sets. The reason is that the individual is more likely to choose distant potential destinations in these data sets; in these cases, the prediction accuracy of the OPS model, in which the individual tends to choose distant potential destinations, is closer to that of the UO model. We further compare the fluxes predicted by the different models with the real fluxes and find that the average fluxes predicted by our model agree better with the real observations than those of the other three models (see details in Supplementary Information, Comparison among different models).
We also use a frequently used statistical index, the root mean square error (RMSE), to measure the prediction errors of the UO model and the other three models; Table 1 lists the results. From the table, we can see that in most cases the RMSE of the UO model is smaller than that of the other benchmark models, even though the RMSE is not the parameter optimization objective of the UO model. These results indicate that the three benchmark models capture the individual's destination selection behavior only at a specific spatiotemporal scale, whereas our UO model can accurately describe the individual's destination selection behavior at different spatiotemporal scales.
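For concreteness, the two comparison metrics can be sketched in Python as below. This is an illustrative sketch, not the authors' code; the SSI formula is the form commonly used in the mobility-prediction literature and is our reading of the definition cited as ref. [38].

```python
import numpy as np

def ssi(T_real, T_pred):
    """Sorensen similarity index between two flux matrices (1 means identical):
    twice the sum of element-wise minima over the sum of both totals."""
    T_real = np.asarray(T_real, dtype=float)
    T_pred = np.asarray(T_pred, dtype=float)
    return float(2 * np.minimum(T_real, T_pred).sum()
                 / (T_real.sum() + T_pred.sum()))

def rmse(T_real, T_pred):
    """Root mean square error between predicted and observed fluxes."""
    T_real = np.asarray(T_real, dtype=float)
    T_pred = np.asarray(T_pred, dtype=float)
    return float(np.sqrt(np.mean((T_real - T_pred) ** 2)))
```

A perfect prediction gives SSI = 1 and RMSE = 0; the SSI rewards overlap of flux mass, while the RMSE penalizes large absolute errors.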

Comparison of the prediction accuracy of the UO model, the radiation model, the OPS model and the OO model in terms of SSI.

17.11 Installing R and Oracle R Enterprise for External Logical SQL Functions

The external Logical SQL functions such as EVALUATE_SCRIPT, FORECAST, and CLUSTER feed input data to the standalone R-process or to Oracle R Enterprise. Therefore, to create analyses that include these functions, you must install either the R or Oracle R Enterprise external engine in your environment.

R is a widely used environment for statistical computing and graphics and can be used with many different datasources including external files or databases. Oracle R Enterprise is installed specifically for use with the Oracle Database, and makes the open source R statistical programming language and environment ready for use by Oracle BI EE.

17.11.1 Installing R and R Packages

To create analyses using the advanced analytics external Logical SQL functions, you must install R and the required R packages.

Before You Begin the Installation

You need to install R version 3.1.1, which is distributed with Oracle BI. You can find the R installer in the following Oracle BI environment location:

The distributed R installation supports Linux (OLE 6 and OLE 7) and Windows (7 and 8).

Installing R and R Packages on UNIX Platforms

Use the procedures in this section to install R and the R packages on UNIX platforms. See "Before You Begin the Installation" for general prerequisite information.

Before you perform the installation, note the following important information and required tasks:

Run the installer as root or by using the sudo command. See the README.txt that is included in r-installer.tar.gz for more information.

Locate proxy.txt in the RInstaller folder and edit it to include the proxy server details.

For OLE 7, before you install the Oracle R distribution, you need to install the TexLive and TexInfo RPMs.

The required RPM versions are: texlive-epsf-svn21461.2.7.4-32.el7.noarch.rpm and texinfo-tex-5.1-4.el7.x86_64.rpm.

Download the RPMs and install them using rpm -ivh <rpm_name> .

You must install the RPMs in this specific order: texlive and then texinfo.
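On OLE 7, the prerequisite RPMs named above can be installed in the required order with commands along these lines (a sketch assuming the RPM files have already been downloaded to the current directory; run as root or via sudo):

```shell
# Install the prerequisite RPMs in the required order: texlive first, then texinfo
rpm -ivh texlive-epsf-svn21461.2.7.4-32.el7.noarch.rpm
rpm -ivh texinfo-tex-5.1-4.el7.x86_64.rpm
```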

To Install R on UNIX Platforms:

Open a command line, navigate to the installer's location, and enter the following to untar and run the distributed R installer:

To Install R Packages on UNIX Platforms:

For UNIX platforms, after you have installed R, run the following command from the same command line to download and install the required R packages (forecast, mvoutlier, randomForest, RJSONIO, and matrixcalc). Running this command also installs the OBIEEAdvancedAnalytics R package. The installer uses the proxy information from proxy.txt to download the packages from CRAN.

Installing R and R Packages on Windows

Use the procedures in this section to install R and the R packages on Windows. See "Before You Begin the Installation" for general prerequisite information.

Before you perform the installation, note the following important information and required tasks:

Locate proxy.txt in the RInstaller folder and edit it to include the proxy server details.

Before you can install R on Windows, you must confirm that your Windows environment contains the wget and the unzip utilities. You can download these utilities from the following locations:

Locate and open NQSConfig.INI. In the ADVANCE_ANALYTICS_SCRIPT section, update the R_EXECUTABLE_PATH property to point to the R executable path. For example:

R_EXECUTABLE_PATH = "C:/Program Files/R/R-3.1.1/bin/x64/R"

Using a zip utility, unzip r-installer.tar.gz.

If you have not already done so, then go to the RInstaller folder, locate proxy.txt, and edit it to include the proxy server details.

To run the installer, go to the RInstaller folder where you unzipped r-installer.tar.gz, locate and then execute './Rinstaller.bat install' in a command line session.

To Install R Packages on Windows:

After you have installed R, then from the same command line run the following command to download and install the required R packages (forecast, mvoutlier, randomForest, RJSONIO, and matrixcalc). Running this command also installs the OBIEEAdvancedAnalytics R package.

17.11.2 Installing Oracle R Enterprise and Required R Packages on the Oracle Database

Oracle BI EE uses the R engine included in Oracle R Enterprise instead of R. Oracle BI EE can use the Oracle R Enterprise colocation option, where the data can reside in the Oracle R Enterprise database. (In the non-colocation option, the data does not reside in the Oracle R Enterprise database.)

See "Before You Begin the Installation" for more information. If you are using databases other than the Oracle Database, then see "Installing R and R Packages" for more information.

Before You Begin the Installation

Oracle BI EE requires that you install Oracle R Enterprise version 1.4 or 1.4.1. See Table 17-11 for more information.
