IBM SPSS Modeler Cookbook

If you've already had some experience with IBM SPSS Modeler, this cookbook will help you delve deeper and exploit the incredible potential of this data mining workbench. The recipes come from some of the best brains in the business.

Product type Paperback
Published in Oct 2013
Publisher Packt
ISBN-13 9781849685467
Length 382 pages
Edition 1st Edition
Table of Contents (11 chapters)

Preface
1. Data Understanding
2. Data Preparation – Select
3. Data Preparation – Clean
4. Data Preparation – Construct
5. Data Preparation – Integrate and Format
6. Selecting and Building a Model
7. Modeling – Assessment, Evaluation, Deployment, and Monitoring
8. CLEM Scripting
A. Business Understanding
Index

Historical introduction to scripting

By the time Clementine Version 4 was released in 1997, the workbench had gained substantial market traction. Its revolutionary visual programming interface had enabled a more business-focused approach to analytics than ever before—all the major families of algorithms were represented in an easy-to-use form, ODBC had enabled integration with a comprehensive range of data, and commercial partners were busy rebadging Clementine to reach a wider audience through new market channels.

The workbench lacked one major kind of functionality: automation, which would allow data mining to be embedded within other applications. It was therefore decided that automation would form the centrepiece of Version 5, provided by two major features: batch mode and scripting. Batch mode made it possible to run the workbench without the user interface, so that streams could run in the background, be scheduled to run at a given time or at regular intervals, or run as part of a larger application. Scripting gave the user automated control of stream execution, even when the user was not present; it was also a prerequisite for any complex operation executed in batch mode.

The motivation behind scripting was to provide a number of capabilities:

  • Gain control of the order of stream execution where this matters, for example, when using the Set Globals node
  • Automate repetitive processes, for example, cross-validation or the exploration of many different sets of fields or options
  • Remove the need for user intervention so that streams could run in the background
  • Manipulate complex streams, for example, if the need arose to create 1000 different Derive nodes

These motives led to an underlying philosophy of scripting: scripts replace the user, not the stream. This means that the operations of scripting should be at the same level as the actions of the user; that is, scripts would create nodes and link them, control their settings, execute streams, and save streams and models. Scripts would not be used to implement data manipulation or algorithms directly; these would remain in the domain of the stream itself. This reflects a fundamental fact about technologies: they are defined as much by what they cannot do as by what they can. These principles are not inflexible (for example, cross-validation might be considered part of an algorithm, yet it was one of the first scripts to be written), but they guided the design of the scripting language. A consequence of this philosophy was that there could be no interaction between script and data; this restriction was lifted only later, with the introduction of access to output objects.

A number of factors influenced the design of the scripting language in addition to the above philosophy:

  • In line with the orientation towards nontechnical users, the language should be simple
  • The timescale for implementation was short, so the language should be easy to implement
  • The language should be familiar, and so should use existing programming concepts and constructs, and not attempt to introduce new ones

These philosophical and practical constraints led to a programming language influenced by BASIC, with structured features taken from POP-11 and an object-oriented approach to nodes taken from Smalltalk and its descendants.
