Introduction
Welcome to The DBL Programming Language, your introductory guide to understanding and mastering DBL. DBL, as a progression of DIBOL, brings together a rich history of business-oriented design with modern software development practices. This programming language allows you to create robust, efficient software that retains a high degree of portability and backward compatibility.
DIBOL, which stands for Digital’s Business Oriented Language, was a general-purpose, procedural, and imperative programming language initially developed and marketed by Digital Equipment Corporation (DEC) in 1970. Designed primarily for use in management information systems (MIS) software development, DIBOL was deployed across a range of DEC systems, from the PDP-8 running COS-300 to the PDP-11 with COS-350, RSX-11, RT-11, and RSTS/E, and eventually on VMS systems with DIBOL-32. DIBOL development by DEC effectively ceased after 1993.
DBL, developed by DISC (now Synergex), emerged as a successor to DIBOL. After an agreement between DEC and DISC, DBL replaced DIBOL on VMS, Digital UNIX, and SCO Unix. It has since been further developed and supported by Synergex, adding support for object-oriented and functional programming styles.
DBL strikes a balance between the business functionality offered by DIBOL and the incorporation of modern programming paradigms. By integrating procedural syntax, reminiscent of its DIBOL roots, with contemporary features like object-oriented programming, DBL allows programmers to manage their organization’s complex business logic. Moreover, DBL’s extensive support for various file access methods and data types makes it an excellent choice for data-intensive applications.
With DBL, you gain control over intricate details without the complexity typically associated with such precision, allowing you to focus on building reliable, cross-platform business software. This book will guide you through the versatility of DBL and how it carries forward the legacy of DIBOL into the realm of modern programming.
Naming
In your journey to learn DBL, you’ll encounter several names for this programming language and its associated tools. This diversity of terminology is an unfortunate side effect of its longevity. Originating from DIBOL, DBL has grown and adapted to accommodate various runtimes and environments.
One crucial aspect of DBL is its standalone runtime, often referred to as Traditional Synergy or Traditional DBL. This standalone variant serves as a comprehensive environment for executing DBL applications independently. The robustness and platform independence offered by Traditional DBL allow for efficient and reliable operation of DBL software written nearly 50 years ago.
Further enhancing DBL’s versatility is its ability to compile natively to .NET. This integration, frequently referred to as Synergy .NET or DBL running on .NET, permits seamless interaction with .NET and its wide ecosystem. This adaptability ensures that DBL applications can leverage the power and scope of .NET’s extensive libraries and tools.
Whether you’re using DBL in its traditional form or harnessing the power of .NET, the essence of DBL remains consistent: offering robust features and superior reliability for diverse programming tasks. The different names for DBL merely reflect its multifaceted nature and ability to operate in varied environments.
Who This Book Is For
This book assumes that you’ve written code in another programming language but doesn’t make any assumptions about which one. We’ve tried to make the material broadly accessible to those from a wide variety of programming backgrounds. While we do not delve into the fundamentals of what programming is or the general philosophy behind it, we place significant emphasis on explaining the intricacies of real-world DBL. This includes detailed discussions on when and why to use various features of the language, as well as shedding light on common patterns and practices prevalent in the DBL community. However, if you are entirely new to the world of programming, you might find more value in a resource specifically tailored as an introduction to programming concepts. This book aims to serve as a bridge between your existing programming knowledge and the specificities and nuances of DBL.
How to Use This Book
In general, this book assumes that you’re reading it in sequence from front to back. Later chapters build on concepts in earlier chapters, and earlier chapters might not delve into details on a particular topic but will revisit it in a later chapter. Some chapters can be skipped if your codebase doesn’t use the features that they cover. (An example is the “Persisting Your Data: ISAM” chapter.) However, you should still read a chapter that covers newer .NET-specific features, even if your codebase doesn’t currently use those features.
Special Thanks
This book is inspired by, loosely structured as, and sometimes copied from the Rust Book.
Getting Started
Let’s start your DBL journey! There’s a lot to learn, but every journey starts somewhere. In this chapter, we’ll discuss
- Installing Synergy/DE (the DBL development tools) on Windows
- Writing a program that prints “Hello World”
- Using MSBuild for a .NET DBL application
Installation
We’ll start our DBL journey by installing development tools on a 64-bit Windows machine.
Step 1: Install Visual Studio 2022
To start, you need to install Visual Studio 2022. This is required for the MSBuild-based .NET Framework and Traditional DBL build systems. When developing for .NET 6+, this is not required but is still recommended for a better development experience.
- Download Visual Studio 2022: Visit the official Visual Studio download page and download the installer for Visual Studio 2022.
- Run the installer: Open the downloaded installer and proceed with the installation.
- Choose workloads: During the installation process, select the .NET desktop development workload. This workload includes necessary tools and libraries for DBL development.
Step 2: Download Synergy/DE installers
Next, you need to download the Synergy/DE (SDE) installers. These provide command line tools for developing, compiling, linking, running, and licensing your DBL applications.
- Visit the Synergex Resource Center: Go to https://resources.synergex.com/SiteDownloads.
- Log in: Use your Resource Center account to log in. If you don’t have an account, you’ll need to contact Synergex to get one.
- Download installers: Download the three latest Windows installers:
- Synergy/DE 64-bit (104)
- Synergy/DE 32-bit (101)
- SDI 32-bit (101)
Step 3: Install Synergy/DE
Now, install Synergy/DE on your machine. First run the downloaded Synergy/DE 64-bit (104) installer, and then run the Synergy/DE 32-bit (101) installer. Follow the on-screen instructions and pay close attention to step 4 below to complete the installations.
Step 4: Configure licensing
During the installation of Synergy/DE, you’ll need to configure the licensing.
- License configuration prompt: If this is the first time Synergy/DE has been installed on your machine, you will be prompted for license configuration.
- Select licensing type:
- Choose License Server or Stand-alone (Local Licensing) if you have a unique licensee name provided by Synergex.
- Choose License Client if you need to specify the name of a license server.
- Enter license details:
- For License Server or Stand-alone: Enter the unique licensee name given by Synergex.
- For License Client: Enter the name of your license server. If you install 32-bit Synergy/DE on a machine with 64-bit Synergy/DE network server licensing, the server name for the 32-bit installation defaults to the 64-bit machine’s name.
Step 5: Install Synergy/DE Visual Studio Integration
Now install Synergy/DE Integration for Visual Studio (SDI) by running the SDI 32-bit (101) installer. This will integrate Synergy/DE with Visual Studio 2022.
Step 6: Verify the installation
After installation, it’s good practice to verify that everything is set up correctly.
- Verify licensing: Open a command prompt and run "%SYNERGYDE64%\dbl\dblvars64.bat" && lmu. This should display the licensing information for your machine.
Conclusion
You have now successfully set up the DBL programming environment on your Windows machine using Visual Studio 2022. In the future when someone says to go to a DBL command prompt or ensure DBL environment variables have been configured in your command prompt, you can just run "%SYNERGYDE64%\dbl\dblvars64.bat", which will bring in the environment variables required to build/run DBL code in 64-bit. If you need the 32-bit version instead, you can just replace the “64” parts in that command with “32” and run "%SYNERGYDE32%\dbl\dblvars32.bat". Now we can move on to creating our first DBL program.
Hello World
Now that you’ve installed Visual Studio, Synergy/DE, and SDI, you’re ready to create your first program in DBL.
Creating a project directory
Start by making a directory to store your DBL code. It doesn’t matter where your code resides, but for the exercises and projects in this book, we suggest making a “projects” directory in your home directory and keeping all your projects there.
Open a terminal, move to your home directory, and enter the following commands to make a projects directory and a directory within that for the project you’ll create for your first DBL program, “Hello World!”
mkdir "%USERPROFILE%\projects"
cd /d "%USERPROFILE%\projects"
mkdir HelloWorld
cd HelloWorld
Add Traditional DBL tools to PATH
Whether you want to build and run 32-bit applications or 64-bit applications, you’ll need to run one of the following commands at a Windows command prompt to add the appropriate Synergy DBL runtime to your PATH. If you want to build and run 32-bit applications, run "%SYNERGYDE32%\dbl\dblvars32.bat". If you want to build and run 64-bit applications, run "%SYNERGYDE64%\dbl\dblvars64.bat".
Write the “Hello World” program
Next, make a new source file and call it HelloWorld.dbl. DBL files don’t need to end with .dbl, and your company may have a different standard file extension, but it’s generally easier just to use .dbl. If your filename contains more than one word, the convention in this book will be to begin each word with a capital letter (i.e., PascalCase). For example, use HelloWorld.dbl rather than helloworld.dbl for the source file you are creating here.
Now open the HelloWorld.dbl file you just created and add the following code:
proc
Console.WriteLine("Hello World")
Anatomy of a DBL program
Let’s take a closer look at the code you just entered. Because the first line of the code is proc, this line serves as an implicit main declaration. It tells DBL that you’re defining a main routine, which is the program’s entry point and a collection of statements that perform tasks. The second line, Console.WriteLine("Hello World"), is an expression that prints the string “Hello World” to the screen. The Console.WriteLine part of the expression indicates that you want to use the WriteLine method defined in the Console class.
You’ll learn more about classes, methods, and how to define your own methods later in this book. In later examples we’ll sometimes use an implicit main, as we did here, and we’ll sometimes explicitly declare main. For now, an implicit declaration is more convenient. Your codebase likely has it both ways, and you can use either.
Compile the program
Before running a DBL program, you must compile and link it using the DBL compiler and linker. First, we’ll compile by entering the dbl command and passing it the name of your source file, like this:
dbl HelloWorld.dbl
If you get a message indicating that “dbl isn’t recognized,” run dblvars64.bat, as instructed in Add Traditional DBL tools to PATH (above), and then run dbl HelloWorld.dbl.
Link your application for Traditional DBL
The compile step above will produce an .obj file. This is an object file that contains the compiled code for your application. You’ll need to link this object file to produce a runnable .dbr file. You can do this by executing
dblink HelloWorld
Run your application with the Traditional DBL runtime
You can run the .dbr file on Windows in a semi-GUI fashion by executing this:
dbr HelloWorld.dbr
Or run it directly in the console by executing this:
dbs HelloWorld.dbr
In either case, you should see the message “Hello World” as well as the following, which indicates that the program completed without error:
%DBR-S-STPMSG, STOP
%DBR-I-ATLINE, At line 2 in routine HELLOWORLD (HelloWorld.dbl)
When you click the OK button in the pop-up window that opens at this point, the message window and the program close.
That’s enough manual building and running of DBL applications. Let’s move on to building a system with MSBuild.
Hello MSBuild
In this section, we’ll walk through the steps to create a new “Hello World” project using the .NET command-line interface (CLI) and a custom template named “SynNetApp”. The .NET CLI is a powerful tool that allows you to create, build, and run .NET projects from the command line. Don’t worry if your company isn’t currently targeting .NET. The important parts of this walkthrough show how projects can be built and what sort of artifacts you can expect as output.
Open your command-line interface
First, open a Windows Command Prompt or a Terminal window. You’ll use this to run the .NET CLI commands.
Install Synergy project templates
Before creating a project, you need to ensure that the Synergex.Projects.Templates package is installed on your system. If you haven’t installed it yet, you can do so by running the following command:
dotnet new install Synergex.Projects.Templates
This command will download and install all the DBL project templates you might want to use, including SynNETApp.
If you get an error like “Synergex.Projects.Templates could not be installed, the package does not exist”, the problem could be that the NuGet feed (https://api.nuget.org/v3/index.json) is not set up as a package source on your system. See Microsoft documentation on package sources for information on adding this package source.
Create a new project
Now, you’re ready to create a new project. Navigate to the directory where you want to create your project and run
dotnet new synNETApp -n HelloMSBuild
SynNETApp is the name of the template used to create a DBL console application for .NET, and -n HelloMSBuild specifies the name of your new project. You can replace HelloMSBuild with whatever project name you prefer.
Navigate to your project directory
Once the project is created, navigate to the project directory:
cd HelloMSBuild
Replace HelloMSBuild with your project name if you chose a different one.
Explore the project structure
Take a moment to explore the newly created project structure. You should see two files:
- Program.dbl contains the entry point of the application. This is where the application starts executing.
- HelloMSBuild.synproj contains the project definition in an MSBuild-compatible XML format. This file defines the project’s name, source files, options, and dependencies.
Let’s dig into the created HelloMSBuild.synproj file by opening it in a text editor. It should look something like this:
<Project Sdk="Microsoft.NET.Sdk" DefaultTargets="restore;Build">
<PropertyGroup>
<OutputType>Exe</OutputType>
<TargetFramework Condition="'$(TargetFrameworkOverride)' == ''">net6.0</TargetFramework>
<TargetFramework Condition="'$(TargetFrameworkOverride)' != ''">$(TargetFrameworkOverride)</TargetFramework>
<DefaultLanguageSourceExtension>.dbl</DefaultLanguageSourceExtension>
<EnableDefaultItems>false</EnableDefaultItems>
</PropertyGroup>
<ItemGroup>
<Compile Include="Program.dbl" />
</ItemGroup>
<ItemGroup>
<PackageReference Include="Synergex.SynergyDE.Build" Version="23.*" />
<PackageReference Include="Synergex.SynergyDE.synrnt" Version="12.*" />
</ItemGroup>
</Project>
Root element and SDK reference
Take a look at the first line, which contains the root element and specifies an SDK and default target:
<Project Sdk="Microsoft.NET.Sdk" DefaultTargets="restore;Build">
- <Project> is the root element of the MSBuild file. It contains configuration data for the project.
- Sdk="Microsoft.NET.Sdk" specifies that this project uses the .NET SDK. This SDK provides a set of standard targets, properties, and items.
- DefaultTargets="restore;Build" sets the default targets to be executed when MSBuild runs this file. In this case, it will first run the restore target (to restore NuGet packages) and then the Build target.
PropertyGroup
The next element, PropertyGroup, defines a group of properties for the project:
<PropertyGroup>
<OutputType>Exe</OutputType>
<TargetFramework Condition="'$(TargetFrameworkOverride)' == ''">net6.0</TargetFramework>
<TargetFramework Condition="'$(TargetFrameworkOverride)' != ''">$(TargetFrameworkOverride)</TargetFramework>
<DefaultLanguageSourceExtension>.dbl</DefaultLanguageSourceExtension>
<EnableDefaultItems>false</EnableDefaultItems>
</PropertyGroup>
- <OutputType>Exe</OutputType> specifies the output type of the project, which in this case is an executable file.
- <TargetFramework> sets the target framework for the project. This project targets .NET 6.0 by default, but it can be overridden with a different framework version if TargetFrameworkOverride is specified.
- <DefaultLanguageSourceExtension>.dbl</DefaultLanguageSourceExtension> sets the default source file extension for the language used in this project, which is .dbl.
- <EnableDefaultItems>false</EnableDefaultItems> disables the default behavior of including certain files in the build based on conventions. This may change in the future, but currently there are issues with automatic includes for DBL files.
ItemGroup for compile
The next element, ItemGroup, is used to define a group of items that represent inputs into the build system:
<ItemGroup>
<Compile Include="Program.dbl" />
</ItemGroup>
The <Compile Include="Program.dbl" /> line specifies that Program.dbl is to be compiled.
ItemGroup for PackageReferences
The next ItemGroup section is for NuGet package references:
<ItemGroup>
<PackageReference Include="Synergex.SynergyDE.Build" Version="23.*" />
<PackageReference Include="Synergex.SynergyDE.synrnt" Version="12.*" />
</ItemGroup>
- The first <PackageReference> line adds a NuGet package reference for Synergex.SynergyDE.Build, a package that includes the DBL compiler. The “23.*” notation indicates that the project should use the latest available version of the package with a major version number of 23. The asterisk wildcard allows the project to use a minor or patch version under major version 23.
- Similarly, the next <PackageReference> line adds a reference to Synergex.SynergyDE.synrnt, the runtime support package, and includes a version wildcard for major version 12.
Customize the “Hello World” message
Open Program.dbl in your favorite text editor. You’ll see code that looks something like this:
import System
main
proc
Console.WriteLine("Hello World")
endmain
Customize the message by changing it to something like this:
Console.WriteLine("Hello from the DBL book!")
Run your application
Finally, it’s time to run your application. Go back to your command line interface and execute
dotnet run
This command will compile and execute your application. You should see your custom message displayed in the console. Congratulations! You’ve successfully created and run a new .NET application using the synNETApp template.
Now let’s set aside the very helpful .NET CLI, and try the build and run steps manually.
Build your application with MSBuild
So far you’ve seen dotnet CLI act as a wrapper around MSBuild. Now let’s try to build the project using MSBuild directly. From the command line, run
msbuild HelloMSBuild.synproj
This command will compile your application and produce an executable file named HelloMSBuild.exe under the bin/Debug directory. Depending on the version of .NET you are running, you should see a folder such as net6.0 or net8.0. Once you navigate to that inner directory, you can run the executable directly, as we’ll see in the next step.
Run your application directly
Navigate to the inner directory created in the previous step (e.g., bin/Debug/net6.0). From here, run the built executable file with the following command:
HelloMSBuild.exe
You should see your custom “Hello World” message displayed in the console just as before. We prefer using the .NET CLI in this book, but it’s good to know what is going on under the hood. Next, we’ll move on to something a little more interesting than “Hello World”—a guessing game!
Programming a Guessing Game
Let’s jump into DBL by working through a hands-on project together! This chapter introduces you to a few common concepts by showing you how to use them in a real program. You’ll learn about variables, main functions, terminal I/O, and more. In the following chapters, we’ll explore these ideas in more detail. In this chapter, you’ll just practice the fundamentals.
We’ll implement a classic beginner programming problem, a guessing game. Here’s how it works: the program will generate a random integer between 1 and 100. It will then prompt the player to enter a guess. After a guess is entered, the program will indicate whether the guess is too low or too high. If the guess is correct, the game will print a congratulatory message and exit.
Setting up a new project
To set up a new .NET project, go to the projects directory that you created in Chapter 1 and make a new console app project by using the .NET CLI, like so:
dotnet new SynNETApp -n GuessingGame
cd GuessingGame
The first command, dotnet new, takes the template name (SynNETApp) as the first argument and the name of the project (GuessingGame) as the second argument. The second command changes to the new project’s directory.
As you saw in the Getting Started section, dotnet new generates a “Hello World” program for you. Check out the Program.dbl file:
import System
main
proc
Console.WriteLine("Hello World")
endmain
Now let’s compile this “Hello World” program and run it in the same step using the dotnet run command:
dotnet run
The run command comes in handy when you need to rapidly iterate on a project, as we’ll do in this game, quickly testing each iteration before moving on to the next one.
Open the Program.dbl file. You’ll be writing all the code for your program in this file.
Processing a guess
The first part of the guessing game program will ask for user input, process that input, and check whether the input is in the expected form. To start, we’ll allow the player to input a guess. Replace the contents of Program.dbl with the following:
import System
main
proc
stty(0) ; Enable .NET console input
Console.WriteLine("Guess the number!")
Console.WriteLine("Please input your guess.")
data guess = Console.ReadLine()
Console.WriteLine("You guessed: " + guess)
endmain
Code breakdown
Let’s break down each part of the code, starting with the first line:
import System
import System tells the compiler to make the contents of the System namespace implicitly available in this source file. For our use case, System provides access to fundamental classes for managing input and output (I/O), basic data types, and other essential services. This access is necessary to use the Console class in the program.
main
proc
main indicates the starting point of the program. In DBL, main is a special keyword used to define the entry point of the application. As you’ve seen in the Hello World example, you can skip the main keyword and just start with proc. In this example we’re using a fully declared main. proc signals the transition from the data division to the procedure division. The procedure division contains the statements that perform the tasks of the program.
stty(0) ; Enable .NET console input
DBL has multiple ways to read input from the console. The stty statement is used to enable the .NET console input. This is necessary for using the Console.ReadLine() method in the program, and it’s mutually exclusive with other console input methods.
Console.WriteLine("Guess the number!")
Console.WriteLine is a method from the .NET Console class that writes a line of text to the standard output stream, which is the console in this case.
The following line outputs the text to prompt the user for a guess:
Console.WriteLine("Please input your guess.")
Similar to the previous line, this line outputs “Please input your guess.” to the console. It’s a prompt for the user to enter their guess.
The next line declares a variable named guess:
data guess = Console.ReadLine()
data guess declares a variable named guess. Because it does not specify the type of the variable, the type will be inferred from the value assigned to it. In this case, it will be a string. The = followed by a call to Console.ReadLine() reads the next line of characters from the standard input stream (the console input in this case) and then stores the result in the variable guess.
Console.WriteLine("You guessed: " + guess)
This combines “You guessed: ” with the value stored in guess, displaying the user’s input on the console.
Finally, the endmain, which is optional, marks the end of the main procedure:
endmain
Testing the first part
Let’s test the first part of the guessing game. Run it using dotnet run:
dotnet run
Guess the number!
Please input your guess.
5
You guessed: 5
At this point, the first part of the game is complete. We’re getting input from the keyboard and then printing it.
Generating a secret number
Next, we need to generate a secret number that the user will try to guess. To make the game fun to play more than once, the number should be different every time. We’ll use a random number between 1 and 100 so the game isn’t too difficult. DBL has a built-in random number facility, the RANDM routine, but since it’s not a super ergonomic function, we’re going to use the Random class from the System namespace instead. Let’s start using Random to generate a random number between 1 and 100. Replace the contents of Program.dbl with the following:
import System
main
proc
stty(0) ; Enable .NET console input
Console.WriteLine("Guess the number!")
data random = new Random()
data randomNumber = random.Next(1, 101) ; Generates a random number between 1 and 100
Console.WriteLine("Please input your guess.")
data guess = Console.ReadLine()
Console.WriteLine("You guessed: " + guess)
Console.WriteLine("The secret number was " + %string(randomNumber))
endmain
First, we’ve added a variable named random and assigned it a new instance of the Random class. This is a class that provides a convenient way to generate random numbers. Next we’ve added a variable named randomNumber and assigned it the result of calling the Next method on the random variable. The Next method takes two arguments; the first is the inclusive lower bound of the random number, and the second is the exclusive upper bound. In this case, we’re passing 1 and 101, so the random number will be between 1 and 100. Finally, we’ve added a line to convert our secret number to a string and output it to the console.
Try running the program a few times:
dotnet run
Guess the number!
Please input your guess.
You guessed: 50
The secret number was 37
You should get different random numbers, and they should all be numbers between 1 and 100. Great job!
Comparing the guess to the secret number
Now that we have user input and a random number, we can compare them. Replace the contents of Program.dbl with the following:
import System
main
proc
stty(0)
Console.WriteLine("Guess the number!")
data random = new Random()
data randomNumber = random.Next(1, 101) ; Generates a random number between 1 and 100
Console.WriteLine("Please input your guess.")
data guess = Console.ReadLine()
data guessNumber, int, %integer(guess)
if(guessNumber > randomNumber) then
Console.WriteLine("Too big!")
else if(guessNumber < randomNumber) then
Console.WriteLine("Too small!")
else
Console.WriteLine("Correct!")
endmain
First, we’re calling %integer and passing it the string we read off the console in order to convert it from a string to an int. In that same line, we’ve added a new variable, guessNumber, declared that it’s an int, and assigned its initial value. Next, we have a block of IF-ELSE statements. The if statement checks if guessNumber is greater than randomNumber. If it is, it prints “Too big!”. The else if statement checks if guessNumber is less than randomNumber. If it is, it prints “Too small!”. Finally, the else statement is a catch-all that prints “Correct!” if guessNumber is neither greater than nor less than randomNumber. If we hadn’t converted guess to an int, the compiler wouldn’t allow us to compare it with randomNumber because they would be different types.
If you run the program now, you’ll see something like the following:
dotnet run
Guess the number!
Please input your guess.
55
Too small!
You might be tempted to experiment with inputting something other than a number to see what happens. Go ahead and try it. You’ll see something like this:
dotnet run
Guess the number!
Please input your guess.
dd
Unhandled exception. Synergex.SynergyDE.BadDigitException: Bad digit encountered
at Synergex.SynergyDE.VariantDesc.ToLong()
at Synergex.SynergyDE.SysRoutines.f_integer(VariantDesc parm1, NumericParam parm2)
at _NS_GuessingGame._CL.MAIN$PROGRAM(String[] args)
Not very user-friendly! Let’s add some error handling to make this a better experience for the user. Replace the contents of Program.dbl with the following:
import System
main
proc
stty(0)
Console.WriteLine("Guess the number!")
data random = new Random()
data randomNumber = random.Next(1, 101) ; Generates a random number between 1 and 100
Console.WriteLine("Please input your guess.")
data guess = Console.ReadLine()
try
begin
data guessNumber, i4, %integer(guess)
if(guessNumber > randomNumber) then
Console.WriteLine("Too big!")
else if(guessNumber < randomNumber) then
Console.WriteLine("Too small!")
else
Console.WriteLine("Correct!")
end
catch(ex, @Synergex.SynergyDE.BadDigitException)
begin
Console.WriteLine("Please type a number!")
end
endtry
endmain
We’ve added a try block around the code that converts the guess to an int and compares it to the randomNumber. We’ve also added a catch block that catches the BadDigitException and prints a more user-friendly message. We could also have used Int.TryParse from .NET, but this way you can see some explicit error handling. If you run the program now and enter something other than a number, you’ll see something like this:
dotnet run
Guess the number!
Please input your guess.
dd
Please type a number!
Allowing multiple guesses with looping
Now that we have the basic game working, we can make it more interesting by allowing multiple guesses. To do this, we’ll use a repeat loop. The repeat loop continues until exitloop is executed.
Replace the contents of Program.dbl with the following:
import System
main
proc
stty(0)
Console.WriteLine("Guess the number!")
data random = new Random()
data randomNumber = random.Next(1, 101) ; Generates a random number between 1 and 100
repeat
begin
Console.WriteLine("Please input your guess.")
data guess = Console.ReadLine()
try
begin
data guessNumber, i4, %integer(guess)
if(guessNumber > randomNumber) then
Console.WriteLine("Too big!")
else if(guessNumber < randomNumber) then
Console.WriteLine("Too small!")
else
begin
Console.WriteLine("Correct!")
exitloop
end
end
catch(ex, @Synergex.SynergyDE.BadDigitException)
begin
Console.WriteLine("Please type a number!")
end
endtry
end
endmain
We’ve added a repeat block around the entire chunk of code that gets the user’s guess and compares it to randomNumber. We’ve also made the else block into a compound statement so we can execute both Console.WriteLine and exitloop. Now if you run the program you’ll see something like the following:
dotnet run
Guess the number!
Please input your guess.
55
Too big!
Please input your guess.
25
Too big!
Please input your guess.
13
Too big!
Please input your guess.
6
Too big!
Please input your guess.
3
Correct!
And there you have it! A fully functioning guessing game. You can play it as many times as you want, and it will generate a new random number each time. You can also try to guess the number as many times as you’d like.
Summary
This project was a hands-on way to introduce you to some of the fundamentals of DBL. You learned about variables, looping, error handling, and conditional statements. In the next chapter, we’ll dive deeper into common programming concepts.
Common Programming Concepts
In this chapter, we delve into fundamental concepts prevalent across the spectrum of programming languages and examine their applications within DBL. At their roots, most programming languages share many similarities. While none of the concepts we explore in this chapter are exclusive to DBL, our discussion will be framed in the context of DBL, providing explanations around the conventions associated with these concepts.
In particular, we will explore variables, basic data types, routines, comments, and control flow. These foundational elements constitute the backbone of every DBL program. Gaining an early understanding of these elements will equip you with a robust core knowledge to kickstart your journey in DBL programming.
Multi-line statements
DBL considers the end-of-line character to be a statement terminator. This means that in order to write a multi-line statement, you need to use the line continuation character, which is “&”. The & goes not on the end of the prior line, but instead on the start of the continuing line. This can be an unpleasant surprise for new developers coming from other languages.
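Here is a minimal sketch of a continued statement (the message text is just an illustration):
proc
    Console.WriteLine("this message is built from "
&       + "two source lines joined by a continuation")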
Identifiers
Identifiers serve as unique names to identify various elements within the code, such as keywords, variables, routines, and labels. The rules governing the structure of identifiers in DBL are designed to ensure readability and manageability of the code. In Traditional DBL, an identifier can have up to 30 characters, while the .NET environment allows for over 1000 characters, accommodating more descriptive naming conventions. The first character of an identifier in Traditional DBL must be alphabetic (A-Z), whereas in the .NET environment, an underscore (_) is also permissible as the initial character. The subsequent characters can be a mix of alphanumeric characters (A-Z, 0-9), dollar signs ($), or underscores (_), allowing for a combination that can include abbreviations, acronyms, or even certain special characters to create meaningful and informative names.
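As a quick sketch of these rules (the field names here are invented for illustration):
record
    customer_name, a30     ;letters, digits, "$", and "_" are all allowed after the first character
    total$owed,    d10     ;a dollar sign is legal inside an identifier
    _netOnly,      i4      ;a leading underscore compiles on .NET but not in Traditional DBL
endrecord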
Keywords
Unlike many other languages, DBL doesn’t reserve keywords for language features that existed before its ninth version. This allows you to use terms typically reserved in other languages as variable or function names in DBL. While newer versions have introduced some reserved words, much of DBL’s core functionality doesn’t rely on specific keywords. If you’re used to other languages, this might seem unusual, but it’s part of DBL’s design.
Case sensitivity
DBL is mostly case insensitive, so while you may see keywords or identifiers shown in UPPERCASE letters, the compiler treats the identifier upPPeR_cASe identically to UPPER_CASE. The only time DBL is case sensitive is when calling object-oriented code or when it needs to decide between two otherwise ambiguous identifiers.
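For example (a minimal sketch), every spelling below refers to the same field:
record
    Total_Amount, d10
proc
    total_amount = 100                          ;same field as Total_Amount
    Console.WriteLine(%string(TOTAL_AMOUNT))    ;still the same field; prints 100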
Variables
DBL routines have two main divisions: the data division and the procedure division. The data division (which precedes the PROC statement in a DBL routine) is where data items, such as records and groups, are declared and organized. The procedure division, on the other hand, contains the executable code, or the operations performed on these data items.
Historically, the separation between these two divisions was strict, but more recent versions of DBL allow for variable declarations within the procedure division using the DATA keyword. This is similar to the transition from C89 to C99, where variable declarations were allowed within the body of a function. This change has been a welcome addition to DBL, as it allows for more readable and maintainable code.
Records
Within the data division, records are structured data containers that can be either named or unnamed. They can hold multiple related data items of various types, but they differ from the aggregate data structures in other languages in that they represent an instance as well as a type definition. The compiler doesn’t require you to put ENDRECORD at the end of a record, but it is considered good form. Records are considered a top-level declaration within the data division, so while you can nest groups within records, you cannot nest records within records.
Named vs unnamed records
The existence of both named and unnamed records can be a little confusing for developers new to DBL. When should you use one over the other? Named records have two use cases. The first use is code clarity: if you have a lot of fields, grouping them by purpose can make it easier to reason about them. The second (and much more complex) use is when you want to refer to all the data with a single variable. There is a lot to this last use case, so let’s unpack it.
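Here is a minimal sketch of that idea (the EmployeeRecord layout and values are hypothetical):
record EmployeeRecord
    id,        d6
    last_name, a20
    hire_date, d8
endrecord
record
    buffer, a34              ;same total size as EmployeeRecord (6 + 20 + 8)
proc
    EmployeeRecord.id = 123456
    EmployeeRecord.last_name = "Smith"
    EmployeeRecord.hire_date = 20240101
    buffer = EmployeeRecord  ;the whole named record is usable as one 34-byte alpha
    Console.WriteLine(buffer)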
You can see from the sketch above that we’re treating all of the EmployeeRecord data as a single big alpha. This is very common in DBL code, and it’s one of the things that makes I/O very natural in this language. You can write an entire record to disk or send it over the network without any serialization effort. With unnamed records, however, this is not the case. Unnamed records are just a way to group related data. They are not types. You can’t pass them to a routine or return them from a routine. They just group data.
Top-level data division records can be declared with different storage specifiers: stack, static, or local. These specifiers determine the lifespan and accessibility of all variables under them.
“Stack” records and the variables they contain behave like local variables in most other programming languages. They are allocated when the scope they are declared in is entered and deallocated when that scope is exited.
“Static” records and the variables they contain have a unique characteristic. There’s exactly one instance of a static variable across all invocations of its defining routine. This behavior is similar to global variables, but with the key difference that the scope of static variables is specifically limited to the routine that defines them. Once initialized, they retain their value until the program ends, allowing data to persist between calls.
“Local” records and the variables they contain behave similarly to static variables in that they are shared across all invocations of their defining routine. However, the system might reclaim the memory allocated to local variables if it’s running low on memory. When this feature was introduced, computers had significantly less RAM, so local variables were a flexible choice for large data structures. There is no reason to use them today.
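To make the distinction concrete, here is a small sketch (the routine and field names are invented): the static record keeps its count across calls, while the stack record is recreated on every call.
subroutine counter_demo
    static record
        call_count, i4       ;one shared instance; retains its value between calls
    stack record
        work_value, i4       ;allocated when the routine is entered, discarded on return
    proc
        call_count = call_count + 1
        work_value = call_count * 10
        Console.WriteLine("call " + %string(call_count) + ", work value " + %string(work_value))
        xreturn
endsubroutine
Calling xcall counter_demo three times prints call 1, call 2, and call 3, because call_count survives between invocations, while work_value is recomputed from scratch each time.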
Groups
“Groups” allow for nested organization and fixed-size arrays. Although groups are frequently employed as composite data types, the preferred approach for new code is to use a “structure.” (We’ll get around to structures in the chapter on complex types.) This suggestion to prefer structures stems from the fact that these complex data types, even when implemented as group parameters, essentially function as alpha blobs. Consequently, the compiler’s type checker is unable to assist in detecting mismatches or incompatibilities, making structures a safer and more efficient option.
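As a brief sketch (the field names are illustrative), a group can nest related fields and declare a fixed-size array of them:
record customer
    name,          a30
    group phones,  [3]a      ;a fixed-size array of three phone groups
        area_code, a3
        number,    a7
    endgroup
endrecord
proc
    customer.name = "Acme Corp"
    customer.phones[1].area_code = "916"
    customer.phones[1].number = "5551234"
    Console.WriteLine(customer.phones[1].area_code)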
Commons and global data section (GDS)
Both the COMMON statement and GLOBAL data sections serve to establish shared data areas that are accessible by other routines within a program. However, they differ in how they link the shared data.
The COMMON statement, with its two forms, GLOBAL COMMON and EXTERNAL COMMON, is used to define records that are accessible to other routines or the main routine. GLOBAL COMMON creates a new data area, while EXTERNAL COMMON references data defined in a GLOBAL COMMON statement in another routine.
The main distinguishing factor for COMMON statements is that the data layout (types, sizes, field sequence, etc.) is fixed by the GLOBAL COMMON declaration and cannot be checked during the compilation of EXTERNAL COMMON statements. When these statements are compiled, the compiler creates a symbolic reference to the named field in the common, with the linking process determining the correct data address for each symbolic reference.
Global data sections are also defined in one routine and accessed from other routines, but a global data section is referenced by the name of the data section, rather than by data entities it contains (fields, etc.). Records are the top-level data definitions within GLOBAL-ENDGLOBAL blocks, so fields, groups, and other data entities in global data sections are all contained in records.
Here is an example that illustrates how commons are bound at link time:
global common
fld1, a10
fld2, a10
global common named
nfld1, a10
nfld2, a10
proc
fld1 = "fld1"
fld2 = "fld2"
nfld1 = "nfld1"
nfld2 = "nfld2"
Console.WriteLine("initial values set")
Console.WriteLine("fld1 = " + fld1)
Console.WriteLine("fld2 = " + fld2)
Console.WriteLine("nfld1 = " + nfld1)
Console.WriteLine("nfld2 = " + nfld2)
xcall common_binding_1()
xcall common_binding_2()
end
subroutine common_binding_1
common
fld1, a10
fld2, a10
common named
nfld1, a10
nfld2, a10
proc
Console.WriteLine("initial values common_binding_1")
Console.WriteLine("fld1 = " + fld1)
Console.WriteLine("fld2 = " + fld2)
Console.WriteLine("nfld1 = " + nfld1)
Console.WriteLine("nfld2 = " + nfld2)
xreturn
endsubroutine
subroutine common_binding_2
common named
nfld2, a10
nfld1, a10
common
fld2, a10
fld1, a10
proc
Console.WriteLine("initial values common_binding_2")
Console.WriteLine("fld1 = " + fld1)
Console.WriteLine("fld2 = " + fld2)
Console.WriteLine("nfld1 = " + nfld1)
Console.WriteLine("nfld2 = " + nfld2)
xreturn
endsubroutine
Output
initial values set
fld1 = fld1
fld2 = fld2
nfld1 = nfld1
nfld2 = nfld2
initial values common_binding_1
fld1 = fld1
fld2 = fld2
nfld1 = nfld1
nfld2 = nfld2
initial values common_binding_2
fld1 = fld1
fld2 = fld2
nfld1 = nfld1
nfld2 = nfld2
You can see from the output that it doesn’t matter what order the fields are declared in. The linker will bind them to the correctly named fields in the COMMON statement. By contrast, the unique feature of global data sections is that they are essentially overlaid record groups, with each routine defining its own layout of a given global data section. This functionality was abused in the past to reduce the memory overhead of programs. The size of each named section is determined by the size of the global data section marked INIT. Here’s an example that shows how a global data section can be defined differently when it is accessed, which demonstrates the binding differences between COMMON and GDS:
global data section my_section, init
record
fred, a10
endrecord
record named_record
another_1, a10
another_2, a10
endrecord
endglobal
proc
fred = "fred"
another_1 = "another1"
another_2 = "another2"
Console.WriteLine("initial values set")
Console.WriteLine("fred = " + fred)
Console.WriteLine("another_1 = " + another_1)
Console.WriteLine("another_2 = " + another_2)
xcall gds_example_1
end
subroutine gds_example_1
global data section my_section
record named_record
another_1, a10
another_2, a10
endrecord
record
fred, a10
endrecord
endglobal
proc
Console.WriteLine("values in a routine, bound differently")
Console.WriteLine("fred = " + fred)
Console.WriteLine("another_1 = " + another_1)
Console.WriteLine("another_2 = " + another_2)
end
Output
initial values set
fred = fred
another_1 = another1
another_2 = another2
values in a routine, bound differently
fred = another2
another_1 = fred
another_2 = another1
You can see from the output that the field names within the GDS don’t matter to the linker. A GDS is just a section of memory to be overlaid with whatever data definitions you tell it to use.
In terms of storage, both COMMON and global data sections occupy separate spaces from data local to a routine, and the data space of a record following a COMMON or global data section is not contiguous with the data space of the COMMON or global data section.
Often in the past, developers chose to use COMMON and global data sections instead of parameters, either because they were told they were more performant or because refactoring wasn’t required when they included additional data in their routines. But here’s a list of reasons why you might want to avoid using global data instead of parameters:
- Encapsulation and modularity: Routines that rely on parameters are self-contained and only interact with the rest of the program through their parameters and return values. This makes it easier to reason about their behavior, since you only have to consider the inputs and outputs, not any external state.
- Easier debugging and maintenance: Since parameters limit the scope of where data can be changed, they make it easier to track down where a value is being modified when bugs occur. If a routine is misbehaving, you generally only have to look at the routine itself and the places where it’s called. On the other hand, if a global variable is behaving unexpectedly, the cause could be anywhere in your program.
- Concurrency safety: In multi-threaded environments like .NET, global variables can lead to issues if multiple threads are trying to read and write the same variable simultaneously, whereas parameters are local to a single call to a routine and don’t have this problem.
- Code reusability: Routines that operate only on their inputs and outputs, without depending on external state, are much more flexible and can easily be reused in different contexts.
- Testing: Routines that use parameters are easier to test, as you can easily provide different inputs for testing. With global variables, you need to carefully set up the global state before each test and clean it up afterwards, which can lead to errors.
Data
When the DATA keyword was introduced in DBL version 9, it brought significant improvements in the way variables were managed. DATA enables you to declare a variable in the procedure division of a routine, so you can define a variable where you use it. Here is a DBL .NET example:
proc
data my_int, int
if (expression)
my_int = 10
With Traditional DBL, DATA declarations must be declared at the top of a BEGIN-END block, before any other statements. (This restriction does not apply to DBL running on .NET.)
proc
...
begin
data my_int, int
if (expression)
my_int = 10
...
end
It’s worth noting that these variables are stored on the stack, so the scope of a DATA variable is limited to the BEGIN-END block it’s defined in or to the routine.
DATA declarations bolster code readability and maintainability by making it convenient to use variables for a single, well-defined purpose in a specific context. This makes it less likely that variables will be reused for different data types or for different purposes—a practice that can obscure a variable’s state and role in the code, making it challenging to understand and maintain.
Additionally, variables declared where they are used are more likely to have meaningful names that accurately reflect their purpose within their routines, which supports the creation of self-documenting code.
Type inference and DATA declarations
DATA declarations support type inference with the syntax data variable_name = some_expression. With this syntax, the compiler deduces the type of a variable based on the assigned expression, eliminating the need for an explicit type declaration. Type inference can lead to more compact and, in some scenarios, more readable code. However, it can also result in less explicit code that can cause confusion, particularly in a team setting or when revisiting older code. This risk, however, is largely mitigated by modern integrated development environments (IDEs), which often show you the deduced type of the variable.
There are also pros and cons when it comes to refactoring. On the one hand, type inference can simplify refactoring. For instance, if you change the type of a function’s return value, you won’t need to update variables that infer their type from the function. This can expedite refactoring and reduce errors caused by incomplete changes. On the flip side, changing the expression assigned to a variable could unintentionally change the variable’s type, potentially introducing bugs that are difficult to detect. This risk is particularly high if the variable is used in many places or in complex ways, as the impact of the type change could be widespread and unexpected.
Type inference restrictions
If a routine return value is of type a, d, or id, the compiler will not be able to infer the data type for initial value expressions. Data types such as structures, classes, and sized primitives can be correctly inferred by the compiler.
DATA example
The following example shows the basics of data declarations:
proc
begin ;This enclosing BEGIN-END is only required in Traditional DBL.
;You can omit it on .NET.
data expression = true
if(expression)
begin
data explicitly_typed_inited, a10, "hello data"
data just_typed, int
just_typed = 5
Console.WriteLine(explicitly_typed_inited)
;;In Traditional DBL, the following line would cause an error if it wasn't commented:
;;data another_declaration, a10, "hello data"
end
end
Output
hello data
Scope shadowing
DBL follows variable shadowing rules similar to those in other languages, meaning that an identifier declared within a scope can shadow an identifier with the same name in an outer scope. For example, if a global variable and a local variable within a function have the same name, the local variable takes precedence within its scope. This follows the pattern of the most narrowly scoped declaration of a variable being silently chosen by the compiler. This can lead to situations where changes to the local variable do not affect the global variable, even though they share the same name. While shadowing can be used to create private instances of variables within scopes, it can also lead to confusion and errors if not managed carefully, as it may not be clear which variable is being referred to at a given point in the code. If you don’t already have code review conventions to manage this risk, they’re worth considering. Here’s a short example to illustrate the concept:
record
my_field, a10
proc
my_field = "hello"
begin
data my_field, d10
;;this is a totally different variable
my_field = 999999999
Console.WriteLine(%string(my_field))
end
;;once we exit the other scope, we're back to the original variable
Console.WriteLine(my_field)
Output
999999999
hello
Paths and abbreviated paths
You can declare the same field name multiple times within a group or named record structure. However, when accessing fields with the same name, ensure that the path used to reach the field is unique. This requirement is due to the compiler’s need for specificity when navigating nested structures. Here’s an example:
import System
main
record inhouse
accnt ,d5
name ,a30
address ,a40
zip ,d10
record client
accnt ,[100]d5
group customer ,[100]a
name ,a30
group bldg ,a
group address ,a
street ,a4
zip ,d10
endgroup
endgroup
group contact ,a
name ,a30
group address ,a
street ,a40
zip ,d10
endgroup
endgroup
endgroup
proc
;;These succeed because they're fully qualified:
Console.WriteLine(contact.name)
Console.WriteLine(contact.address.street)
Console.WriteLine(customer[5].bldg.address.street)
Console.WriteLine(client.accnt)
Console.WriteLine(inhouse.accnt)
Console.WriteLine(inhouse.name)
;;These succeed, though there are more deeply nested alternatives:
Console.WriteLine(name)
Console.WriteLine(customer[1].name)
;;These fail:
Console.WriteLine(client.name)
Console.WriteLine(address.street)
Console.WriteLine(accnt)
endmain
With the data structure above, the following paths are valid because they unambiguously lead to a single field:
contact.name
contact.address.street
customer[5].bldg.address.street
client.accnt
inhouse.accnt
inhouse.name
And the following paths are valid because they are actually fully qualified paths, even though there are more deeply nested variables with the same name. These would have failed prior to version 12.
customer[1].name
name
However, the following paths are invalid because they could refer to multiple fields:
client.name
address.street
accnt
Using any of these invalid paths will result in a helpful compiler error that will tell you what the matching symbols are.
Compiler output
%DBL-E-AMBSYM, Ambiguous symbol client.name
1>%DBL-I-ERTXT2, MAIN$PROGRAM.client.customer.name
1>%DBL-I-ERTXT2, MAIN$PROGRAM.client.customer.contact.name : Console.WriteLine(client.name)
Quiz
- What are the two main divisions in a DBL program?
- Procedure division and memory division
- Data division and procedure division
- Data division and memory division
- Procedure division and static division
- Which keyword allows for variable declarations within the procedure division in recent versions of DBL?
- Var
- Data
- Local
- Static
- What are the storage specifiers available for records and groups in DBL?
- Static, local, and global
- Stack, global, and local
- Stack, static, and local
- Local, global, and data
- What is the primary difference between “stack” and “static” variables in DBL?
- Stack variables are shared across the program, while static variables are unique to each function
- Static variables retain their value between function calls, while stack variables are deallocated when the scope is exited
- Stack variables persist until the program ends, while static variables are deallocated when the scope is exited.
- Static variables are shared across the program, while stack variables are unique to each function
- How are “data” variables stored in DBL?
- They are stored statically
- They are stored locally
- They are stored on the stack
- They are stored globally
- Which data containers in DBL can hold multiple related data items of various types?
- Functions
- Groups
- Records
- Variables
- True or false: It’s mandatory to put ENDRECORD at the end of a record declaration in DBL.
Primitives
In DBL, weakly typed descriptor types like alpha, decimal, implied decimal, and integer are not bound by strict data type constraints. This wasn’t really done on purpose—it’s the result of being developed in the days of single-pass compilers and very limited memory, where it was not possible to enforce strong typing. DBL’s continuation with this weak typing means that variables declared with these types can be assigned a variety of values or even manipulated in ways that are typically prevented in strongly typed systems. While this enables very old applications to move forward without significant costly refactoring, it increases the risk of type-related errors, necessitating a more cautious and thorough approach to debugging and data handling. It is possible to tell the modern DBL compiler to enforce strong typing, but it requires well-organized projects and setting a few compiler switches.
Consider the following example, where a routine expects an alpha parameter, but a caller instead passes a decimal. In a strongly typed language, this would result in a compile-time error. However, with the right set of compiler switches, the compiler for Traditional DBL will allow a decimal to be passed in. At runtime there may be no ill effects, depending on the value. A negative number would result in an unexpected alpha, while the result of a positive number would look as if the routine caller had passed the correct type.
proc
xcall test_parameters(5, "5", 5)
xcall test_parameters(-5, " 5", -5)
xcall test_parameters(0, "p5zdfsdf", 0)
endmain
subroutine test_parameters
param1, a
param2, d
param3, n
proc
Console.WriteLine(param1)
Console.WriteLine(%string(param2 + 5))
Console.WriteLine(%string(param3 + 5))
xreturn
endsubroutine
Output
5
10
10
u
10
0
0
-5:46341
5
%DBR-S-STPMSG, STOP
You can see from the u and -5:46341 outputs that this scenario wouldn’t be a good idea in a real program. You are likely to encounter code like this in legacy DBL programs, but it’s very unlikely that it will manifest itself in this obvious way. Because of the age and relative stability of most DBL code, it’s more likely with production code that some rarely taken code path is not being tested and is almost never seen in production.
DBL can also enforce a more rigid type system, as seen with the string type. Here, a variable declared as a string can only hold string data, and any operation that attempts to change its type will result in a compile-time error. This strict type enforcement promotes data consistency and type safety, reducing runtime errors related to unexpected data conversion. Developers must perform explicit type conversions and cannot rely on the language to coerce types implicitly, leading to more predictable though verbose code.
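Here is a brief sketch of that contrast (the field names are illustrative, and the commented-out line shows the kind of assignment the compiler rejects):
record
    message, string
    amount,  d6
proc
    amount = 123456
    ;message = amount                       ;compile-time error: no implicit conversion from decimal to string
    message = "total: " + %string(amount)   ;explicit conversion keeps the types consistent
    Console.WriteLine(message)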
Alpha
An alpha value (a, a*, a{size}) is a sequence of printable ASCII characters treated as a single information unit.
- a: An alpha parameter, return type, or method property.
- a*: An alpha data field with size determined by its initial value.
- a{size}: An alpha data field of specified size, default filled with spaces. For example, a10 is a 10-character alpha.
Platform limits
Alphas have the following maximum-length restrictions on certain platforms:
- 65,535 single-byte characters on 32-bit Windows and 32-bit Linux when running Traditional DBL
- 32,767 single-byte characters on OpenVMS
- 2,147,483,647 single-byte characters on all other platforms
Decimal and implied decimal
Decimal (d, d*, d{size}) and implied-decimal (d., d{size}.{precision}, decimal) types in DBL handle numbers as sequences of ASCII characters, ensuring an exact representation. Both decimal and implied-decimal types are signed, meaning they can represent both positive and negative numbers.
In a typical DBL program, the avoidance of floating-point numbers like float and double (which are discussed below) is deliberate. Floating-point representations can introduce rounding errors due to their binary format, which cannot precisely depict most decimal fractions. This imprecision, although minuscule in a single operation, can compound in financial contexts, leading to significant discrepancies. Therefore, DBL programmers rely on decimal and implied-decimal types for monetary computations to preserve data integrity.
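To make the contrast concrete, here is a minimal sketch of a money-style calculation using implied-decimal fields (the field names and values are invented for illustration):
record
price, d8.2, 19.99
qty, d3, 3
total, d10.2
proc
total = price * qty
Console.WriteLine(%string(total))  ;59.97 exactly, with no binary floating-point surprises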
Integer
An integer (i, i*, i1, i2, i4, and i8) is a byte-oriented, binary representation of a signed whole number. The integer value depends on its usage and can be a value type or descriptor type.
Numeric
Numeric types (n and n.) define numeric parameters that can pass any of the numeric data types: decimal, packed (discussed below), or integer for n, and implied decimal or implied packed for n. (note the trailing period in the type name). It is not possible to declare a numeric field, only a parameter or return type.
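For example, a single subroutine with an n parameter can be called with either a decimal or an integer argument. This sketch assumes invented routine and field names:
main
record
decval, d6, 123
intval, i4, 42
proc
xcall add_one(decval)
xcall add_one(intval)
endmain
subroutine add_one
value, n
proc
Console.WriteLine(%string(value + 1))
xreturn
endsubroutine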
Ownership
Sized declarations
When declaring an a, d, i, or id variable somewhere that defines a memory layout or owns its own memory, you must include a size. Let’s consider an example where we define several fields in a record:
record
myAField, a100
myDField, d10
myIdField, d10.3
myIField, i4
The field myAField allocates a memory space of 100 bytes and tells the compiler to interpret that space as an alpha (or string) type. On the other hand, myIdField allocates 10 bytes of memory but specifies that the last three of those bytes should be interpreted as decimal places. This is an implied-decimal declaration. The decimal places are implied within the byte storage of the field, effectively allowing you to store a decimal number within an ASCII numeric space. If, for example, the memory allocated to myIdField contained the value 1333333333, it would be interpreted and displayed as 1333333.333 when printed, due to the implied decimal places.
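As a small sketch of that behavior, assigning a value to myIdField and printing it shows the implied decimal point, even though the underlying storage is just ten ASCII digits:
record
myIdField, d10.3
proc
myIdField = 1333333.333
Console.WriteLine(%string(myIdField))  ;prints 1333333.333; stored as the digits 1333333333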
Unsized declarations
The size of parameters and return types is determined at runtime, rather than being defined during compile time. This means that the size of these variables is based on the actual data that is passed or returned during the execution of your program.
When declaring a field like an alpha (a) type within a record, structure, group, or class, you would usually specify the size, for example, myFld, a10. However, for parameter or return type declarations, the size is omitted, and you would declare it simply as myFld, a.
The size of this unsized parameter will then be determined by the length of the argument passed at the time of calling. For example, if you call your routine with xcall myRoutine("a short alpha"), the size of the myFld parameter would be 13 at this particular call site, reflecting the length of the string "a short alpha".
main
proc
xcall myRoutine("a short alpha")
endmain
; Routine definition
subroutine myRoutine
in req myFld, a
proc
Console.WriteLine(^size(myFld))
endsubroutine
Output
13
Less common types
Packed and implied packed
Data in packed or implied-packed form (p, p{size}, p., or p{size}.{precision}) is stored as two digits per byte, plus an extra byte for the sign. This data type is uncommon and can usually be migrated to decimal and implied decimal without significant trouble.
Boolean
A boolean data type represents a true/false value, defaulting to false.
Byte
In Traditional DBL, byte represents an eight-bit signed integer (i1), whereas in DBL on .NET, it’s an eight-bit unsigned integer mapping to the .NET System.Byte structure. The default byte value is 0.
Char
In Traditional DBL, char is a 16-bit numeric value (C# char), allowing a DBL program to write records read by C# programs.
Double
In DBL on .NET, double is a value type mapping to the .NET System.Double structure, defaulting to 0.0. It’s not recommended for Traditional DBL.
Float
In DBL on .NET, float maps to the .NET System.Single structure, defaulting to 0.0. It’s not recommended for Traditional DBL.
IntPtr and UIntPtr (DBL on .NET only)
The .NET System.IntPtr is a signed integer and System.UIntPtr is an unsigned integer; both are of native size, depending on the platform.
Long
In Traditional DBL, long maps to i8. In DBL on .NET, it’s a value type mapping to System.Int64, defaulting to 0.
Sbyte
In Traditional DBL, sbyte maps to i1. In DBL on .NET, it represents an eight-bit signed integer and maps to the .NET System.SByte structure. The default value is 0.
Short
In Traditional DBL, short maps to i2. In DBL on .NET, it’s a value type that maps to System.Int16 and defaults to 0.
Quiz
What are the three forms of alpha types in DBL?
- a, alpha, a10
- a, a*, a{size}
- a, a10, a*
- a, a*, a{size}+
What’s the maximum length of alpha types in 32-bit Windows and Linux when running Traditional DBL?
- 65,535 characters
- 32,767 characters
- 2,147,483,647 characters
- 2,147,483,648 characters
In Traditional DBL, how is a byte represented?
- Eight-bit unsigned integer
- Eight-bit signed integer
- Sixteen-bit signed integer
- Sixteen-bit unsigned integer
What is the main difference between the decimal and implied-decimal types in DBL?
- Decimal types are unsigned; implied-decimal types are signed
- Decimal types are whole numbers; implied-decimal types have a sized precision
- Decimal types have a fractional precision; implied-decimal types are whole numbers
- Decimal types are signed; implied-decimal types are unsigned
What is the default value for the .NET System.Double structure in DBL?
- 1.0
- 0.0
- NULL
- Undefined
Which types does short map to in Traditional DBL and DBL on .NET, respectively?
- i2 and System.Int32
- i2 and System.Int64
- i2 and System.Int16
- i4 and System.Int16
When defining a record, when must you include a size for these DBL types: a, d, i, id?
- Only when the fields need to be stored in an array
- Only when the fields will be processed in a loop
- When the fields define a memory layout (this is most of the time)
- When the fields are initialized with a certain value
In DBL, when is the size of parameters and non-^VAL return types determined?
- At compile time
- At runtime
- When they are initialized
- When they are declared
What happens when you pass the string “a slightly longer alpha” to a routine with an unsized alpha parameter myFld, a?
- An error occurs as the parameter is unsized
- The parameter myFld will have a size of 23
- The parameter myFld will have a size of 24, including the end character
- The program will crash
Literals
In DBL, the term “literal” refers to a specific, unchangeable value that’s defined at compile time. This value can either be a textual literal, which is a sequence of characters enclosed between matching single or double quotation marks, or a numeric literal, which consists of numeric characters and can be prefixed with a plus or minus sign.
Now, let’s look at the various types of literals:
Alpha literals
An alpha literal in DBL is essentially a string of characters. These characters are enclosed in either single or double quotation marks and can be up to 255 characters long. For example:
'This is an ALPHA literal' "This is also an ALPHA literal"
To include a quotation mark in an alpha literal, you must use two successive quotation marks. However, if the embedded character is different from the ones delimiting the literal, you don’t need to double it.
For example:
"This is an ""embedded"" quotation mark" 'This is an "embedded" quotation mark'
You can also split an alpha literal into smaller parts, and the DBL compiler will concatenate these parts if they are separated by blanks and tabs. Note that alpha literals are case sensitive.
It’s important to remember that DBL does not support the C-style escape sequences found in some other languages. For instance, while in C or C# you might use \r\n to represent carriage return and newline, in DBL you would use + %char(13) + %char(10). So, to concatenate the words “hello” and “world” with a newline in between, you would write it as "hello" + %char(13) + %char(10) + "world".
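As a quick sketch in context (the variable name is invented), the concatenated value can be stored and printed like any other string:
record
msg, string
proc
msg = "hello" + %char(13) + %char(10) + "world"
Console.WriteLine(msg)  ;prints hello and world on separate lines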
Decimal literals
A decimal literal is a sequence of numeric characters, which can be positive or negative, and can include up to 28 digits. Leading zeros without significance are removed by the compiler. Examples include
273949830000000000
1
-391
Implied-decimal literals
Implied-decimal literals are similar to decimal literals but include a decimal point. There can be up to 28 digits on either side of the decimal point. For instance:
1234.567
123456789012345678.0123456789
+18.10
0.928456989
-518.0
The compiler removes nonsignificant leading or trailing zeros before processing.
Integer literals
In DBL, you can’t directly write integer literals like you would alpha, decimal, or implied-decimal literals. However, when a decimal literal is part of an arithmetic expression with an integer variable, the compiler generates an integer literal in the code.
Literal definition
Note: the LITERAL declaration is a legacy concept that never really paid off; you’ll encounter it in existing code, but there’s no need to keep using it in new code.
The LITERAL declaration allows you to define a local, read-only data structure that cannot be altered during the execution of the program. The companion statement ENDLITERAL marks the conclusion of this declaration.
You can declare a literal within a routine. Depending on your needs, a literal can be either global, which results in the allocation of data space, or external, which simply references the data space. The fields within an EXTERNAL LITERAL map to the data in the corresponding GLOBAL LITERAL based on matching field names. If you do not specify either GLOBAL or EXTERNAL, the literals are added to the local literal space.
When you provide a name for your literal, it can be accessed as an alpha data type. If no name is given, the literal is deemed unnamed. Bear in mind that each literal’s name must be distinct within the routine it is declared in.
For unnamed literals, the data space related to a field is not created until the field is referenced for the first time. In contrast, if a literal is named, all of its contents are instantiated unconditionally. If a group is defined within an unnamed literal, the group’s contents are also unconditionally created, even if they are not referenced.
Note that if you do not specify the size of an integer global literal, DBL defaults the size to i4. As an example, in the following definition, lit is created as an i4 literal:
global literal
lit ,i, 2
For an external literal, automatic size specification (*) is not permitted.
You are only allowed to specify a field position indicator for a literal field under two conditions: if the literal record is named, or if the field is located within a GROUP-ENDGROUP block. For instance, the following would be considered an invalid literal declaration:
literal ;Invalid literal declaration
lit1 ,i4
lit2 ,i4 @lit1
It’s possible to overlay a literal record onto another if the overlaid record is named. However, you cannot specify a packed data field within the scope of a literal field.
Boxing literals
In the .NET environment, literals or those cast as type “object” undergo a type conversion from a DBL literal type to a .NET literal type before being boxed. For instance, the literal “abc” is changed to a string type, and the number 10 becomes @int. If you wish to retain an alpha, decimal, or implied-decimal literal type, simply cast the literal as the desired DBL type (@a, @d, @id). Boxing literals usually happens when adding a literal directly to an object collection or passing it as a parameter to a routine that takes an object parameter.
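For example, the following sketch (the collection name is invented) adds literals to an ArrayList; the uncast forms are boxed as .NET types, while the cast forms keep the DBL types:
record
items, @System.Collections.ArrayList
proc
items = new System.Collections.ArrayList()
items.Add("abc")      ;boxed as a System.String
items.Add(10)         ;boxed as @int
items.Add((@d)10)     ;the cast keeps it a DBL decimal
items.Add((@a)"abc")  ;the cast keeps it an alpha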
Comments
All programmers strive to make their code easy to understand, but sometimes extra explanation is warranted. In these cases, programmers leave comments in their source code that the compiler will ignore but that people reading the source code may find useful.
Here’s a simple comment:
; hello, world
The idiomatic comment style starts a comment with one or two semicolons, and the comment continues until the end of the line. For comments that extend beyond a single line, you’ll need to include ; on each line, like this:
; I'm going to need a lot of space to explain this concept fully,
; multiple lines in fact. Whew! I hope this comment will
; explain what’s going on.
Comments can also be placed at the end of lines containing code:
Console.WriteLine("Hello World") ;This is a comment
TODO: add a section describing common patterns in dbl codebases. Routine top comments and the usage as a change log. make suggestions on what to do going forward.
Documentation comments
When compiling for .NET, similar to XML documentation in C#, triple semicolons (;;;) are used to create documentation comments. These are a special kind of comment that can be processed by a documentation generator to produce software documentation. They are typically placed immediately before the code statement they are annotating.
Documentation comments can be extracted to generate external documentation or used by integrated development environments (IDEs) and code editors to provide contextual hints and auto-completion suggestions.
The XML tags used in DBL documentation comments are identical to those used in C# XML documentation. Some of the commonly used XML tags include
- <summary>: Provides a short description of the associated code.
- <param name="paramName">: Describes a parameter for a method or function.
- <returns>: Describes the return value of a method or function.
- <remarks>: Provides additional information about the code, such as any special considerations or usage information.
- <exception cref="exceptionName">: Indicates the exceptions a method or function might throw.
- <seealso cref="memberName">: Creates a link to the documentation for another code element.
namespace doc_comment
class ADocumentedClass
;;; <summary>
;;; This function performs a calculation.
;;; </summary>
;;; <param name="input1">The first input value.</param>
;;; <param name="input2">The second input value.</param>
;;; <returns>The result of the calculation.</returns>
;;; <remarks>This function uses a specific algorithm to perform the calculation.</remarks>
method Calculate, int
input1, int
input2, int
proc
mreturn input1 * input2
end
endclass
endnamespace
Assignment
The equal sign (=) is used as an assignment operator. It takes the value from one operand and stores it in the other, but with quite a few special behaviors. The left operand must be a variable specification, with the right operand treated as an expression. The evaluated expression is stored in the variable before the variable is used in the rest of the expression.
The basic syntax for an assignment operation is
destination = source
- destination: A simple, subscripted, or ranged variable that will be modified.
- source: An expression whose evaluated result is assigned to the destination variable.
Assignment operator examples
For instance, the following assignment operation:
xcall sub(a=1, b=2, c="ABC")
is equivalent to
a=1
b=2
c="ABC"
xcall sub(a, b, c)
It’s possible to create operations like X + Y = 3, where Y is set to 3 before it’s added to X. This allows for expressions like if ((len=^size(arg)).eq.4).
In an assignment statement like decimal_var = decimal_var * 3, the current value of decimal_var is multiplied by 3, then reassigned as the new value for decimal_var.
DBL typically rounds results when storing from an implied-decimal source expression. To truncate results, use the %TRUNCATE function or set the TRUNCATE option on MAIN, FUNCTION, and SUBROUTINE statements, or set system option #11.
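As an illustration, this sketch (field name invented, and assuming the single-argument form of %TRUNCATE) contrasts the default rounding with truncation:
record
amount, d8.2
proc
amount = 12.3456                ;rounded on store: 12.35
Console.WriteLine(%string(amount))
amount = %truncate(12.3456)     ;truncated instead: 12.34
Console.WriteLine(%string(amount))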
TODO: init and clear
Data type assignments
The rules for moving data between variables of different types vary based on the data types involved.
Alpha to alpha
When the source and destination are of the same length in an alpha-to-alpha assignment, data is directly copied from the source to the destination. However, if the lengths differ, specific rules are followed:
- When the source data length is shorter than the destination’s, the data is copied from the source to the destination starting from the left. Any remaining space in the destination is filled with blank spaces to the right.
- When the source data length exceeds that of the destination, only the leftmost characters from the source are copied until the destination space is filled. Notably, this process does not trigger any warning or error, meaning the extra characters from the source are simply ignored and not copied.
Here are some examples of alpha-to-alpha assignments:
record xyz
result ,a4
afld1 ,a6, "abcdef"
afld2 ,a2, "xy"
proc
Console.WriteLine(result = afld2)
Console.WriteLine(result = afld1)
Console.WriteLine(result = "1234")
Output
xy
abcd
1234
Alpha to string
Directly assigning an alpha to a string may not yield the desired results, as whitespace behaves differently in strings and alphas. In strings, trailing whitespace is significant, while in alphas it’s generally ignored. Therefore, if you assign an alpha to a string as in string_var = a500Var, you’ll get a 500-character-long string, even if it’s mostly whitespace. To circumvent this and drop the trailing whitespace, use string_var = %atrimtostring(a500var).
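Here is a brief sketch of the difference (field names invented):
record
a500var, a500
string_var, string
proc
a500var = "customer name"
string_var = a500var
Console.WriteLine(%string(string_var.Length))    ;500
string_var = %atrimtostring(a500var)
Console.WriteLine(%string(string_var.Length))    ;13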
Alpha to numeric
The process of assigning an alpha to a numeric destination involves several steps:
- Evaluate the source: The alpha source data is evaluated to produce a numeric value.
- Assign according to type: The resulting numeric value is assigned to the destination according to the rules for moving a value of the source’s data type to the destination’s data type.
- Check for invalid characters: During the conversion process, if a character that isn’t a blank, decimal point, sign (+ or –), or numeric digit is encountered, a “Bad digit encountered” error ($ERR_DIGIT) occurs. Note that sign characters can be positioned anywhere within the alpha source and are applied in the order they appear. Blanks and plus signs (+) are disregarded, and only one decimal point is permitted.
- Identify the numeric type: If a decimal point is present in the source, the evaluated result is considered implied-decimal. The rules for moving an implied-decimal source to an implied-decimal destination are then followed. If no decimal point is present, the result is treated as a decimal, and the rules for moving a decimal source to a decimal destination are adhered to.
To illustrate these concepts, here are some examples of alpha-to-decimal assignments:
record results
decvar ,d6
impvar ,d5.3
int4var ,i4
int2var ,i2
int1var ,i1
proc
Console.WriteLine(decvar = "-1-2-3")
Console.WriteLine(decvar = "123456789") ;overflow results in loss of leftmost digits
Console.WriteLine(decvar = " 3 5 8 ")
Console.WriteLine(decvar = "9.78") ;9 if compiled with DBLOPT #11 (rounding vs truncating)
Console.WriteLine(impvar = " 6448.3") ;overflow results in loss of leftmost digits
Console.WriteLine(impvar = "54.32")
Console.WriteLine(impvar = "19.3927") ;19.392 if compiled with DBLOPT #11
Console.WriteLine(int1var = "456") ;overflow results in loss of high-order byte
Console.WriteLine(int2var = "-231.796") ;-231 if compiled with DBLOPT #11
Console.WriteLine(int4var = "123456789") ;123456789
Console.WriteLine(decvar = "abcde")
Output
-123
456789
358
10
48.300
54.320
19.393
-56
-232
123456789
Unhandled exception. Synergex.SynergyDE.BadDigitException: Bad digit encountered
Numeric to alpha
When assigning numeric data to an alpha destination, the runtime must format numeric data to fit the destination. The assignment takes into account whether the format is implicitly or explicitly specified in the statement.
If the source data is integer, packed, or implied-packed, it is first converted to decimal format before the assignment.
Rules for implicit formatting:
If the source is a numeric expression, the assignment follows these rules:
- Load right-justified significant digits: The significant digits from the source are right-justified and loaded over leading blanks in the destination.
- Load negative sign: If the source value is negative, a minus sign is placed immediately before the leftmost digit.
- Insert decimal point: For implied-decimal sources, the decimal point is inserted between the whole number part and the fractional precision.
- Handle larger source than destination: If the source value, including any decimal point, has more digits than the destination can hold, only the rightmost part is transferred. If the source is negative, the minus sign is omitted without raising any warning or error.
- Handle equivalent source and destination sizes: If the number of digits and any decimal point in the source is the same size as the destination, all digits are transferred. Similar to the previous rule, if the source value is negative, the minus sign is omitted, and no warning or error is raised.
Let’s look at some examples of decimal-to-alpha assignments following these implicit formatting rules. Notice the leading whitespace and the effect of storing into the a9 vs a6 fields:
record
alpha6_result ,a6
alpha9_result ,a9
record
dfld1 ,d3, -23
dfld2 ,d6, -123456
impfld1 ,d4.2, 68.54
impfld2 ,d12.4, 12345678.9876
intfld1 ,i1, 99
intfld2 ,i2, 1003
intfld4 ,i4, 82355623
proc
Console.WriteLine(alpha6_result = dfld1)
Console.WriteLine(alpha6_result = dfld2)
Console.WriteLine(alpha6_result = impfld1)
Console.WriteLine(alpha6_result = impfld2)
Console.WriteLine(alpha9_result = dfld2)
Console.WriteLine(alpha9_result = impfld2)
Console.WriteLine(alpha6_result = intfld1)
Console.WriteLine(alpha6_result = intfld2)
Console.WriteLine(alpha6_result = intfld4)
Console.WriteLine(alpha9_result = intfld4)
Output
   -23
123456
 68.54
8.9876
  -123456
5678.9876
    99
  1003
355623
 82355623
Explicit formatting
To format an assignment between a numeric and an alpha value, use the form destination = source, format. The format specification depicts how the destination will look. The formatted result fills the destination starting from the right, with any overflow ignored without an error or warning. You will likely encounter this sort of formatting in your codebase, but know that when writing new code using .NET, you can also use String.Format and its associated formatting literals.
The format specification contains certain case-sensitive characters:
- X: This uppercase character represents a single digit from the source data. The digit is placed into the destination starting from the right and continuing to the left. Leftover X positions are filled with zeros.
- Z: This uppercase character also represents a digit but behaves differently from X. Extra Z positions (when there are fewer digits in the source than Z characters in the format) are filled with blanks, unless there is a period or X to its left.
- *: The asterisk also represents a digit position. When there are no more significant digits to be transferred, the position is filled with an asterisk, not a zero.
- Money sign: This character functions similarly to Z, but when there are no more digits, the position is filled with a money sign (default “$”).
- -: This minus sign, when placed as the first or last character in a format, indicates negativity. If the source is positive, a blank space is loaded.
- .: This period character causes a decimal point to be placed at the corresponding position in the destination.
- ,: This comma character is loaded at the corresponding destination position if more source digits remain to be transferred and an X character appears to its left.
Avoid using formatting characters as normal text, as they might cause unexpected results. If you need to substitute formatting characters due to regional preferences, use the LOCALIZE subroutine.
In the following example, notice that the formatted assignments are written as standalone statements rather than inside the call to Console.WriteLine; written inside the call, the comma would be interpreted as a second parameter to Console.WriteLine instead of an explicit format specifier.
record
alpha ,a10
proc
alpha = 987, "XXXXXX-"
Console.WriteLine(alpha)
alpha = -987, "XXXXXX-"
Console.WriteLine(alpha)
alpha = 987, "-XXX"
Console.WriteLine(alpha)
alpha = 987, "XXXXXX"
Console.WriteLine(alpha)
alpha = 987, "ZZZZZZ"
Console.WriteLine(alpha)
alpha = -987, "-ZZZZZZ"
Console.WriteLine(alpha)
alpha = 987, "******"
Console.WriteLine(alpha)
alpha = 98765, "Z,ZZZ,ZZZ"
Console.WriteLine(alpha)
alpha = 98765, "*,***,***"
Console.WriteLine(alpha)
alpha = 9, "***.**"
Console.WriteLine(alpha)
alpha = 9876, "$$$,$$$.$$"
Console.WriteLine(alpha)
alpha = 9876, "$$*,***.XX"
Console.WriteLine(alpha)
alpha = 9876, "Val: Z.ZZ"
Console.WriteLine(alpha)
alpha = 95, "This puts a X in"
Console.WriteLine(alpha)
Output
000987
000987-
987
000987
987
- 987
***987
98,765
***98,765
***.09
$98.76
$***98.76
Val: 8.76
uts a 5 in
Explicit justify
When you make a numeric-to-alpha assignment, the formatted information gets loaded right-justified by default. To change it, you can include a justification control, either LEFT or RIGHT, at the end of the assignment statement, like this: statement [justification[:variable]] (the square brackets appear literally in the code, as the examples below show). Here, variable is updated with the number of characters loaded into the destination, not counting leading blanks.
The following examples illustrate the usage and combination with explicit formatting. As with explicit formatting, this syntax does not work when the assignment operator is not the only thing on the line.
record
alpha ,a10
len ,i2
proc
len = 7
Console.WriteLine(alpha = 12345)
alpha = 12345 [LEFT]
Console.WriteLine(alpha)
alpha = 12345 [RIGHT:len]
Console.WriteLine(alpha)
len = 6
alpha = 12345, "ZZ,ZZZ.ZZ-"
Console.WriteLine(alpha)
alpha = 12345, "ZZ,ZZZ.ZZ-" [LEFT]
Console.WriteLine(alpha)
alpha = 12345, "ZZ,ZZZ.ZZ-" [RIGHT:len]
Console.WriteLine(alpha)
Output
     12345
12345
     12345
   123.45
123.45
   123.45
Strings
The term “string” is a fundamental concept representing a sequence of characters. These characters can range from letters and numbers to symbols and whitespace. Strings are used in virtually all programming languages to handle and manipulate text data. In DBL, two types are primarily used for storing string data: alpha and the built-in class System.String.
To avoid confusion, it’s important to understand that “strings” as a universal concept differ from System.String in terms of abstraction levels. While “strings” are a ubiquitous programming concept, System.String is a specific implementation in Traditional DBL and .NET. It provides a set of built-in methods to handle and manipulate text-based data.
Both alphas and System.String types are extensively used for various purposes, like storing and displaying messages and representing names, addresses, and virtually any other type of written information.
Common string operations include
- Concatenation: Joining two or more strings together.
- Length calculation: Determining the number of characters a string contains.
- Substring extraction: Taking part of a string.
- String replacement: Replacing a specific part of a string with another string.
- Case conversion: Changing a string to all uppercase or all lowercase letters.
- Comparison: Checking if strings are identical or determining their alphabetical order.
TODO Note
include when to use LOCALIZE, INSTR, LOCASE, UPCASE, %STRING, %ATRIM, %ATRIMTOSTRING, %TRIMZ, TRIM vs ATRIM, %CHAR, %UCHAR, make note of ranging in the advanced_memory_managment chapter
Alpha-specific routines
These routines are specific to alpha data, and while the compiler will allow you to pass a string to them, you will get unexpected results if the length of the string data is larger than the size limit for alphas on your platform. As a general recommendation, avoid using these routines with strings.
%ATRIM
%ATRIM is used to remove trailing spaces from an alpha. Its syntax is
trimmedAlpha = %ATRIM(alpha_expression)
The %ATRIM function is designed to return an alpha descriptor that points to the same original alpha expression, but with its length adjusted to exclude any trailing blanks. This means that while the returned alpha descriptor represents the trimmed version of the string, the underlying data in memory remains unchanged. For example, if you have an alpha string “Hello “ (with trailing spaces), %ATRIM will return a descriptor that effectively “views” this string as “Hello” without the spaces. However, the original string in memory still contains the spaces. This subtle yet significant behavior of %ATRIM ensures that the original data is preserved, allowing for efficient string manipulation without the overhead of duplicating data. It is particularly useful in scenarios where you have a large alpha buffer and don’t want a temp record or a bunch of extra whitespace in your output.
%ATRIMTOSTRING
%ATRIMTOSTRING is used to remove trailing spaces from an alpha and return the result as a string. Its syntax is
trimmedString = %ATRIMTOSTRING(alpha_expression)
%ATRIMTOSTRING works just like %ATRIM, but it’s more efficient in scenarios where the ultimate destination is a System.String. This is a very common use case when using the newer built-in APIs and/or libraries from the .NET ecosystem.
%TRIM / %TRIMZ
%TRIM returns the length of the passed-in expression minus the quantity of trailing whitespace. Its syntax is
trimmedLength = %TRIM(alpha_expression)
The trimmed length for an otherwise blank alpha expression is 1 for %TRIM and 0 for %TRIMZ. That’s the entire difference between the two. My preference is %TRIMZ because I want to treat blank strings as blank strings and not as strings with a single space in them. I think this is a more intuitive behavior, but you may see code that uses %TRIM instead.
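A short sketch of the difference (field names invented):
record
blank_field, a10
name_field, a10, "Fred"
proc
Console.WriteLine(%string(%trim(blank_field)))   ;1
Console.WriteLine(%string(%trimz(blank_field)))  ;0
Console.WriteLine(%string(%trim(name_field)))    ;4
Console.WriteLine(%string(%trimz(name_field)))   ;4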
UPCASE / LOCASE
UPCASE and LOCASE are used to convert an alpha to all uppercase or lowercase. They are similar to the ToUpper and ToLower methods on System.String. The main functional difference between UPCASE/LOCASE and ToUpper/ToLower is that UPCASE/LOCASE performs its operation in place, modifying the original data. The syntax is
UPCASE(alpha_expression)
LOCASE(alpha_expression)
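A minimal sketch (field name invented); note that the contents are changed in place:
record
name, a20, "Fred Smith"
proc
xcall upcase(name)
Console.WriteLine(name)   ;FRED SMITH
xcall locase(name)
Console.WriteLine(name)   ;fred smith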
System.String specifics
An important thing to note is that System.String is “immutable,” meaning it can’t be changed once created. Any operation that seems to modify a System.String is actually creating a new one. Also, since System.String is an object, it can’t be included directly in a record or structure intended for writing to disk or network transmission. While it’s possible to perform these operations, extra coding is required due to the object’s non-fixed size at compile time.
System.String can be treated as a collection of characters. In .NET, each element is a 16-bit char (a UTF-16 code unit), so indexing and iteration operate on code units rather than full Unicode characters, much like the older fixed-width UCS-2 encoding. In Traditional DBL, System.String is made up of 8-bit characters. Because System.String is a collection, you can index into it or iterate over it using a FOREACH loop.
Building strings
Options for string concatenation and construction include + directly on a System.String or alpha, StringBuilder, and S_BLD. StringBuilder gets its design from .NET and is an optimal choice for operations requiring repetitive concatenation due to its improved performance. S_BLD predates StringBuilder and has significantly more options for formatting the DBL types.
The syntax for S_BLD is
xcall S_BLD(destination, [length], control[, argument, ...])
In this syntax, destination is the variable loaded with the formatted string. It can be either an alpha or a StringBuilder type. Length is an optional variable loaded with the formatted string’s length. Control is the processing control string for the build, and argument can include up to nine arguments as required by the control string.
The S_BLD subroutine builds a formatted string. If the destination is an alpha field, it’s cleared before being loaded with new text from the first position. However, if the destination is a StringBuilder object, the generated string is appended to the object’s existing text.
An example using S_BLD is
xcall s_bld(sb,,"%%+07.03=d => {%+07.03=d}", f2)
The control string can consist of one or more text and/or format segments. Text segments are copied directly into the destination field. If the first two characters of the control are %&, the output string is appended to the existing content. Otherwise, the destination is replaced with new text.
To achieve results equivalent to a Synergy decimal-to-alpha conversion, use a control string of “%0.0nd”, where n is the desired precision.
A format segment takes the next argument from the S_BLD call and formats it into the destination field. It follows this syntax:
%[justification size[.precision]][=]type
Here, justification can be either left or right within the specified size. Size is the minimum width of the formatted result field, and precision is the displayed fractional precision. The presence of = indicates no leading strip processing. Type specifies the type of the next argument to be consumed, either A for alpha type or D for numeric type.
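Putting the pieces together, here is a sketch modeled on the control string shown earlier (the buffer and field names are invented, and the segment flags should be treated as illustrative rather than exhaustive):
record
buf, a40
len, i4
f2, d8.3, 12.345
proc
xcall s_bld(buf, len, "value: %+07.03=d", f2)
Console.WriteLine(buf(1:len))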
Arithmetic
Arithmetic operations are performed using various operators such as addition (+), subtraction (-), multiplication (*), division (/ and //), and, when running on .NET, modulo (.mod.). These operators execute standard, signed, whole number arithmetic if the operands don’t have an implied decimal point. However, if at least one operand contains an implied decimal point, the intermediate result will also include a fractional portion.
Unary plus (+) and minus (-) operators have a special function to denote whether a numeric operand is positive or negative. The unary plus operator is generally ignored, as unsigned values are assumed to be positive, whereas the unary minus operator changes the sign of the operand to its right. If multiple minuses are used consecutively, they are combined algebraically, with two minuses forming a plus, three minuses forming a single minus, and so forth. Importantly, the resulting data type from such operations will mirror that of the operand.
In instances where different numeric types are involved in the same operation, the lower type in the numeric data hierarchy (which includes integer, decimal, implied-decimal, packed, and implied-packed types) is upgraded to match the higher one. Therefore, if an integer and an implied-decimal value are part of the same expression, the integer will be promoted to an implied-decimal before the operation, resulting in an implied-decimal outcome.
In division operations, it’s important to remember that dividing by zero is not allowed and will result in an error. If the division operation (“/”) involves two whole numbers, the fractional part of the result will be discarded without rounding. However, using the “//” division operator always results in an intermediate result that includes a fractional portion. For the modulo operation, available only in DBL on .NET, it provides the remainder of a division operation between two numbers.
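To make the difference concrete, a small sketch (field names invented):
record
whole, d10
fractional, d10.2
proc
whole = 7 / 2         ;3, the fractional part is discarded
Console.WriteLine(%string(whole))
fractional = 7 // 2   ;3.50, // keeps the fractional portion
Console.WriteLine(%string(fractional))
;On .NET only, 7 .mod. 2 would yield the remainder, 1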
Lastly, it’s crucial to note that when a numeric field is used in an arithmetic expression, the language checks for invalid characters or an excessive length. Nonnumeric characters other than a blank, a decimal point, or a sign (+ or -) are not allowed in the operands, and any violation of these rules results in errors. If invalid characters sound odd to you, rest assured we will explain how this happens and how to avoid it when discussing project structure and prototyping. Similarly, if a field exceeds the maximum size for its data type, an error is generated.
Explicit rounding
When manipulating financial data, such as calculating taxes or discounts, accurate rounding of decimal values is a hard requirement. Even slight rounding errors can accumulate over numerous transactions, leading to significant discrepancies. Similarly, when processing user input, rounding to the nearest whole number or a specific decimal place is useful in maintaining output consistency and limiting the precision to an appropriate level for the application’s context. Moreover, while it may be beneficial to retain higher precision in internal computations, rounding can enhance readability when presenting these results to users. In this way, rounding serves multiple purposes, from ensuring accuracy in financial operations to streamlining user interactions and enhancing data presentation.
Rounding operators, specifically the # and ## operators, are often used to manipulate numeric values in precise ways. The # operator is a binary operator that serves to round the left operand, or the value being rounded, according to the number of digits specified by the right operand, the round value. For instance, if the round value is 3, the operation will discard the rightmost three digits of the value being rounded and add 1 to the resulting number if the leftmost discarded digit is 5 or greater.
Both operators follow these shared rules:
- Both operands must be decimal or integer.
- The sign of the result is the same as the sign of the value being rounded.
- If the number of significant digits in the value being rounded is less than the round value, the result is zero.
The # operator, also known as the truncating rounding operator, rounds the left operand (the value being rounded) by the number of digits specified by the right operand (the round value). The distinctive rules for the # operator are as follows:
- The round value must be in the range 1 through 28. Any value outside this range triggers an “Invalid round value” error.
- The resulting data type is identical to that of the original operand.
Conversely, the ## operator allows for true rounding (rounding without discarding any digits in the value to be rounded). Its unique rules are as follows:
- The round value must be in the range -28 through 28.
- If the round value is positive, rounding begins “round value” places to the left of the decimal point. For example, if the round value is 3, rounding begins three places to the left of the decimal point.
- If the round value is negative, rounding begins “round value + 1” places to the right of the decimal point. For example, if the round value is -1, rounding begins two places to the right of the decimal point.
- The resulting data type is always decimal.
Examples
The following examples demonstrate the use of the # and ## operators:
record
tmpdec, d9, 12345
proc
Console.writeLine(%string(12345.6789 ## 3))
Console.writeLine(%string(12345.6789 ## 1))
Console.writeLine(%string(12345.6789 ## 30))
Console.writeLine(%string(12345.6789 ## -3))
Console.writeLine(%string(12345.6789 ## -1))
Console.writeLine(%string(12345.6789 ## -30))
Console.writeLine(%string(tmpdec # 3))
Console.writeLine(%string(tmpdec # 1))
;;fails with DBR-E-RNDVAL, Invalid round value: 30
Console.writeLine(%string(tmpdec # 30))
Output
12000
12350
0
12345.679
12345.7
12345.6789
12
1235
%DBR-E-RNDVAL, Invalid round value: 30
Comparison
Comparison, or relational, operators are used to evaluate the relationship between two values, or operands. They return a Boolean result, true or false, which is internally represented as 1 and 0, respectively. Relational operators are predominantly used in conditional statements, like IF, USING, or WHILE loops, to guide program flow based on the results of comparisons. For instance, they can be used to check if a user input matches a specific value, if a number is greater or smaller than a threshold, or if a string is alphabetically before or after another one. By comparing values and acting upon those comparisons, programs can make decisions and perform actions that adapt to specific conditions or user inputs. We’ll talk more about controlling program flow in the next section.
Here are the basic comparison operators:
- Equal to (== or .EQ.): Returns true if the left and right operands are equal.
- Not equal to (!= or .NE.): Returns true if the left and right operands are not equal.
- Greater than (> or .GT.): Returns true if the left operand is greater than the right one.
- Less than (< or .LT.): Returns true if the left operand is less than the right one.
- Greater than or equal to (>= or .GE.): Returns true if the left operand is greater than or equal to the right one.
- Less than or equal to (<= or .LE.): Returns true if the left operand is less than or equal to the right one.
These descriptions are accurate for numbers and generally true for strings and objects, but things get a little weird when comparing alphas. So this is going to take a lot of example code to explain.
Alpha comparisons
Alphas have their own set of comparison operators. They use the same symbolic operators like == and !=, but they have different meanings. The following list shows the alpha comparison operators and their symbolic equivalents:
- Equal to (== or .EQ.): Returns true if operands are equal in content and length or equal in content up to the shorter of the two operands.
- Not equal to (!= or .NE.): Returns true if operands are not equal in content and length or not equal in content up to the shorter of the two operands.
- Greater than (> or .GT.): Returns true if the left operand is greater than the right one, considering at most the length of the shortest of the two operands.
- Less than (< or .LT.): Returns true if the left operand is less than the right one, considering at most the length of the shortest of the two operands.
- Greater than or equal to (>= or .GE.): Returns true if the left operand is greater than or equal to the right one, considering at most the length of the shortest of the two operands.
- Less than or equal to (<= or .LE.): Returns true if the left operand is less than or equal to the right one, considering at most the length of the shortest of the two operands.
The alpha type refers to a sequence of characters that has a fixed length. When you compare two alpha operands, you’re comparing them character by character, based on the order of characters in the ASCII character set. The comparison only considers the length of the shorter operand, meaning that if one operand is shorter, only the first n characters (where n is the length of the shorter operand) of the longer operand are considered in the comparison. This is definitely considered odd in the world of programming languages, but it’s stuck that way now. As an example, consider the following expressions:
data alpha1, a6, "ABCDEF"
data alpha2, a3, "ABC"
;this returns true because the first 3 characters of alpha1 are equal to alpha2
alpha1 .eq. alpha2
;because both operands are alpha, == works like .eq. above
alpha1 == alpha2
;this returns false because the alphas are a different length
alpha1 .eqs. alpha2
If you want to compare the entire contents of two alpha operands, you can use the string comparison operator .EQS.. This operator will return true only if both operands are the same length and contain the same characters in the same order. For example, the following expression will return false:
"ABCDEF" .eqs. "ABC"
There is no direct way to perform a case-insensitive comparison on alphas, but the common approach is to call UPCASE on both operands before comparing them. For example, the following expression will return true:
upcase(alpha1)
upcase(alpha2)
alpha1 .eq. alpha2
It’s important to note, though, that UPCASE/LOCASE will change the contents of the operands, so if you need to preserve the original values, you’ll need to make a copy of the operands before calling UPCASE/LOCASE.
String comparisons
Unlike alpha, System.String operands are expected to be exactly the size of their intended contents. When one of the operands in a comparison is a System.String, the DBL compiler assumes both operands are System.String or can be converted to System.String and uses the corresponding string comparison operator. This makes string comparisons more like the comparisons seen in other programming languages. Consider the following snippet of code that shows a few string comparisons:
data alpha1, a6, "ABCDEF"
data alpha2, a3, "ABC"
data string1, string, "ABCDEF"
data string2, string, "ABC"
;;this one is true because .eq. of two strings works like .eq. of two alphas
string1 .eq. string2
;;this one is true because both operands are actually the same
string1 .eq. "ABCDEF"
;;these expressions all evaluate to false
string1 .eqs. string2
string1 .eqs. "ABC "
string1 .eqs. alpha2
string1 == string2
string1 == "ABC "
string1 == alpha2
As you can see, it’s pretty easy to get confused. Here’s the same list as above, but you can apply these rules if at least one of the operands is a System.String:
- Equal to (== or .EQS.): Returns true if the left and right operands are equal.
- Not equal to (!= or .NES.): Returns true if the left and right operands are not equal.
- Greater than (> or .GTS.): Returns true if the left operand is greater than the right one.
- Less than (< or .LTS.): Returns true if the left operand is less than the right one.
- Greater than or equal to (>= or .GES.): Returns true if the left operand is greater than or equal to the right one.
- Less than or equal to (<= or .LES.): Returns true if the left operand is less than or equal to the right one.
In Traditional DBL, there is no direct way to perform a case-insensitive comparison on strings, but the common approach is to call ToUpper/ToLower on both operands before comparing them. Unlike calling UPCASE/LOCASE, this will not modify the contents of the operands. As an example showing a case-insensitive comparison, the following expression will return true:
string1.ToUpper() == string2.ToUpper()
In DBL running on .NET, you can use the System.String comparison method directly. For an example showing a case-insensitive comparison, the following expression will return true:
String.Compare(string1, string2, true) == 0
Ternary Operator
TODO: move this into control flow to highlight its similarity to IF/ELSE
The ternary operator, also known as the conditional operator, is a concise way to perform simple IF-ELSE logic in a single line of code. It is called “ternary” because it takes three operands: a condition, a result for when the condition is true, and a result for when the condition is false. The general syntax is condition ? result_if_true : result_if_false. This operator is incredibly useful for simplifying code when assigning a value to a variable based on a condition. It can make the code more readable by reducing the need for more verbose control structures, especially in situations where the control flow logic is straightforward. However, you should take caution not to overuse the ternary operator or use it in overly complex expressions, as it can lead to code that is difficult to read and understand. Always consider the balance between brevity and clarity in your code.
Here are a few example expressions. Notice from the output that true is 1 and false is 0:
proc
;;alpha comparisons
Console.WriteLine("ABCDEF" .eq. "ABC")
Console.WriteLine("ABCDEF" .eq. "ABD")
Console.WriteLine("ABCDEF" .eq. "abc")
;;string style comparisons
Console.WriteLine("ABCDEF" .eqs. "ABC")
Console.WriteLine("ABCDEF" .eqs. "ABCDEF")
Console.WriteLine("ABCDEF" .eqs. "abcdef")
Console.WriteLine("ABCDEF" .eqs. "FEDCBA")
Console.WriteLine(5 > 8)
Console.WriteLine(5 < 8)
Console.WriteLine(true && false)
Console.WriteLine(true && true)
Console.WriteLine(true || false)
Console.WriteLine(true || true)
Console.WriteLine(5 > 8 ? "how did this happen" : "everything normal")
Output
1
0
0
0
1
0
0
0
1
0
1
1
1
everything normal
Control Flow
IF-THEN-ELSE
IF is the most basic statement that allows for conditional control flow in a DBL program. The IF statement checks a specified condition (a statement that evaluates to a Boolean value), and if the condition is true, it executes an associated code block—e.g., a statement or a BEGIN-END block.
Here’s a basic example:
if x > y
begin
; This statement will be executed if x is greater than y
Console.WriteLine("x is greater than y")
end
An ELSE statement is used in conjunction with an IF statement to provide an alternative branch of execution when the IF condition is not met (i.e., when it evaluates to false). The THEN keyword is required in this case, and it makes the syntax more readable, clearly defining the separate paths of execution. The general structure is this: IF condition THEN statement_1 ELSE statement_2. You will often see this form in DBL code: IF(condition) statement. The parentheses can improve readability but are entirely optional.
Here’s a basic example with THEN and ELSE:
if x > y then
begin
; This statement will be executed if x is greater than y
Console.WriteLine("x is greater than y")
end
else
begin
; This statement will be executed if x is not greater than y
Console.WriteLine("x is not greater than y")
end
The ELSE-IF statement makes it possible to specify multiple alternative conditions. It is used after an IF and before a final ELSE statement. If the initial IF condition evaluates to false, the program checks the first ELSE-IF condition. If that ELSE-IF condition evaluates to true, its code block is executed. If not, the program proceeds to the next ELSE statement, if there is one. The THEN keyword is required for the initial IF and each ELSE-IF until the last ELSE or ELSE-IF.
if x > y then
begin
; This statement will be executed if x is greater than y
Console.WriteLine("x is greater than y")
end
else if x = y then
begin
; This statement will be executed if x is equal to y
Console.WriteLine("x is equal to y")
end
else
begin
; This statement will be executed if x is not greater than y and x is not equal to y
Console.WriteLine("x is less than y")
end
Here’s another example that shows when THEN is required and when it is not allowed:
record
fld1, d10, 99999
proc
if(fld1 > 100) then
Console.WriteLine(fld1)
else if(fld1 < 9000) ;;THEN is missing, which causes the first error in the compiler output below
Console.WriteLine("fld1 is on its way up")
else
Console.WriteLine("fld1 was over 9000!")
if(fld1 > 100) then
Console.WriteLine(fld1)
else if(fld1 < 9000) then ;;This THEN causes the second error in the output
Console.WriteLine("fld1 is on its way up")
Compiler output
%DBL-E-INVSTMT, Invalid statement at or near {END OF LINE} : else
%DBL-E-NOSPECL, Else part expected : Console.WriteLine("fld1 is on its way up")
A newline is not required between an IF statement’s condition and its code block:
if x > y then
Console.WriteLine("x is greater than y")
else if x = y then Console.WriteLine("x is equal to y")
else Console.WriteLine("x is less than y")
Multi-way control flow
Complex control flow statements can be an improvement over long chains of IF-THEN-ELSE statements. Complex control statements operate in ways that are similar to the “switch” statement in the C family of languages. There are three main variants:
- CASE
- USING
- USING-RANGE
These all have a similar purpose: an expression is evaluated and then one of several possible code blocks (or none at all) is executed depending on the value of the expression. However, there are important differences in their syntax, how they evaluate conditions, and their performance characteristics.
CASE
CASE is the most basic of the multi-way control statements. It selects from a set of unlabeled or labeled code blocks, based on the value of a control expression.
- Unlabeled CASE - The control expression is non-implied numeric (no decimal point) and is interpreted as an ordinal number 1 through n, where n is the number of statements in the set. For example, a value of 6 selects the sixth code block.
- Labeled CASE - All code blocks are identified with labels that must be literals and of the same type as the control expression. A label can be a single literal or a range (delineated with a hyphen). The control expression is matched with these labels to select the corresponding code block.
ELSE is used to specify a block of code to be executed if no case labels match the value of the switch expression. It’s akin to an ELSE clause in an IF-THEN-ELSE conditional block. If the ELSE case is not provided and no match is found, the CASE statement will simply do nothing.
record
num, i4
color, a10
proc
num = 2
color = "red"
;;Labeled case
case num of
begincase
1: Console.WriteLine("The number is 1")
2: Console.WriteLine("The number is 2")
3: Console.WriteLine("The number is 3")
endcase
else
Console.WriteLine("The number is some other number")
case num of
begincase
1-5: Console.WriteLine("The number is between 1 and 5")
6-9: Console.WriteLine("The number is between 6 and 9")
10-15: Console.WriteLine("The number is between 10 and 15")
endcase
;;Labeled case - alpha
case color of
begincase
"red": Console.WriteLine("The color is red")
"blue": Console.WriteLine("The color is blue")
"green": Console.WriteLine("The color is green")
endcase
else
Console.WriteLine("The color is something else")
;;Unlabeled case
case num of
begincase
Console.WriteLine("The number is 1")
Console.WriteLine("The number is 2")
Console.WriteLine("The number is 3")
endcase
Output
The number is 2
The number is between 1 and 5
The color is red
The number is 2
USING
The USING statement selects a code block for execution based on the evaluation of a control expression against one or more match term conditions. Each match term is evaluated from top to bottom and left to right. Once a match is found, no other condition is evaluated. If no match is found, the null term (opening and closing parentheses with nothing between them) is used if it is specified. The USING statement is more efficient than CASE when using an i or d control expression and when all match terms are compile-time literals.
record
code, a2
proc
code = "AA"
using code select
('0' thru '9'),
begin
;;BEGIN-END can be used here
Console.WriteLine("matched 0 thru 9")
end
('A' thru 'Z'),
Console.WriteLine("matched A thru Z")
("99", '$'),
Console.WriteLine("matched 99 or $")
(.gt.'z'),
Console.WriteLine("matched greater than z")
(),
Console.WriteLine("fell through to the default")
endusing
Output
matched A thru Z
USING-RANGE
The USING-RANGE statement is similar to USING but adds a range for the control expression. This allows you to define a range of values within which the control expression is evaluated, enabling you to supply separate default code blocks for values that fall within the range (by using the %INRANGE label) and values that are outside the range (by using the %OUTRANGE label). The USING-RANGE statement builds a dispatch table at compile time and is typically faster than the USING statement.
In the following example, the range 1-12 is specified for the USING-RANGE statement, so any value from 1 through 12 is considered in range and will invoke the (%INRANGE) code block if there is not a more specific match. The example, however, contains a more specific match (3), so the monthName = "March" statement is executed. If month was instead set to 10, the %INRANGE statement would be executed (resulting in “shrug”). But if month was set to 13, the %OUTRANGE statement would be executed, and the output would be “wild month”.
record
month, int
monthName, a10
proc
month = 3 ; Let's assume the month is March
USING month RANGE 1 THRU 12 SELECT
(1),
monthName = "January"
(2),
monthName = "February"
(3),
monthName = "March"
(4),
monthName = "April"
(%OUTRANGE),
monthName = "wild month"
(%INRANGE),
monthName = "shrug"
ENDUSING
Console.WriteLine(monthName)
Output
March
Mini quiz
- What does this program output if month is 5 instead of 3?
- What does this program output if month is 5555 instead of 3?
While each of these multi-way control mechanisms has its uses, in most modern coding scenarios, USING tends to be the go-to choice due to its flexibility and powerful matching conditions. The CASE statement is straightforward and simple to use, and you’ll frequently encounter it in legacy code (as it was developed earlier than USING). But it is generally slower. Also, the USING-RANGE statement provides a slight efficiency boost when a control expression is evaluated within a predefined range.
It’s important to consider several factors when deciding which mechanism to use in your specific use case. These include the complexity of your matching conditions, the need for a defined range for the control expression, and the importance of execution speed. However, given its power and versatility, the USING statement is often a sensible default choice for new code.
Loops
Loops are foundational constructs used to automate and repeat tasks a certain number of times or until a specific condition is met. FOR loops are typically used in cases where the exact number of iterations is known beforehand, whereas WHILE and WHILE-DO loops are more suitable when iterations depend on certain conditions. FOREACH is used to improve readability when operating over a collection or dynamic array.
There are other methods to handle repetition in code. For instance, recursion, where a function calls itself, is an alternative that can be more intuitive for certain tasks, such as traversing tree-like data structures. However, recursion can lead to higher memory usage and potential stack overflow errors if it is not used correctly.
Earlier, less structured looping mechanisms, such as the GOTO statement, can be used to jump to different points in the code. Some of these mechanisms can be useful, but GOTO, which provides a great degree of freedom, often leads to “spaghetti code” that is hard to read and maintain due to its lack of structure. We’ll discuss GOTO and other less structured mechanisms in Unconditional control flow below.
FOR-FROM-THRU
The FOR-FROM-THRU loop (FOR variable FROM initial THRU final [BY incr]) executes a statement as long as the value of a variable (variable) is within the specified range. The variable’s value is incremented after each iteration. The default increment value is 1, but you can specify an increment amount using BY incr.
record
var, i4
initial, i4, 0
final, i4, 5
proc
for var from initial thru final by 1
begin
Console.WriteLine(var)
end
for var from 5 thru 7 by 1
begin
Console.WriteLine(var)
end
Output
0
1
2
3
4
5
5
6
7
WHILE
The WHILE loop (WHILE condition [DO] statement) continues as long as the specified condition is true. Once the condition is no longer true, the loop will be exited. DO is optional and has no effect on the loop. But if it is there, it must be on the same line as WHILE. The following example produces the same output as the example FOR-FROM-THRU loop above:
record
var, i4
final, i4, 5
incr, i4, 1
proc
var = 0
;If DO is included, it must be on the same line as WHILE
while (var <= final) do
begin
Console.WriteLine(var)
var += incr
end
var = 5
final = 7
;DO is not necessary
while (var <= final)
begin
Console.WriteLine(var)
var += incr
end
FOREACH-IN
The FOREACH-IN loop (FOREACH loop_var IN collection [AS type]) iterates over each element in a collection, setting a loop variable (loop_var) to each element in turn and executing the code block. Note that the loop variable must be the same type as the elements in the collection, or an “Invalid cast” exception will occur. We’ll cover collections and arrays in more detail in the Collections chapter.
You can use the DATA statement to declare the iteration variable directly inside a FOREACH loop. If the compiler can’t infer the variable’s type, you will need to specify it using the AS type syntax, which is discussed below. Here’s an example with and without an inline variable declaration:
record
string_array, [#]String
explicit_iteration_variable, @String
proc
string_array = new String[#] { "hello", "for", "each", "loops" }
foreach data element in string_array
begin
Console.WriteLine(element)
end
foreach explicit_iteration_variable in string_array
begin
Console.WriteLine(explicit_iteration_variable)
end
Output
hello
for
each
loops
hello
for
each
loops
Advanced FOREACH-IN features
You can use the AS type syntax to cast the loop variable to a different type. Explicitly defining the iteration variable’s type in a FOREACH-IN loop is useful when working with untyped collections, such as ArrayList. This allows you to leverage your knowledge of the actual type of items in the collection.
foreach mydecimalvar in arraylist as @int
begin
; statement
end
Similarly, if the collection contains instances of a structure, and the loop variable is of type a, you could cast the loop variable to the structure’s type:
foreach avar in arraylist as @structure_name
begin
; statement
end
Less common loops
While you will likely encounter the following loop types in your codebase, there are few, if any, non-historical reasons to write these loops into new code.
The DO FOREVER loop (DO FOREVER statement) endlessly executes a code block until the loop is broken through an EXITLOOP statement, a GOTO statement, or error catching.
do forever
begin
; statement
end
The REPEAT loop (REPEAT statement), like the DO FOREVER loop, continually executes a code block until control is transferred to some other code due to some condition (e.g., EXITLOOP or GOTO). Having two loop types that work the same is a historical artifact rather than a difference based on some sort of tradeoff.
repeat
begin
; statement
end
A DO-UNTIL loop (DO statement UNTIL condition) executes a specified code block until a provided condition becomes true. It evaluates the condition after each iteration, and if it’s false, the code block is executed again.
do
begin
; statement
end
until condition
A FOR-DO loop (FOR variable = value[, ...] DO) executes a code block for each value in a given list. The loop concludes when all values in the list have been assigned to the variable and the code block has been executed for all.
for variable = value, value2, value3 do
begin
; statement
end
The FOR-UNTIL-DO loop (FOR variable = initial [STEP incr] UNTIL final DO) behaves similarly to the FOR-FROM-THRU loop but allows for modifications to the final value and incr during the loop’s execution.
for var = initial step 1 until final do
begin
; statement
end
Unconditional control flow
Unconditional control flow refers to DBL statements that alter the sequential execution of code without evaluating conditions. These instructions (GOTO, EXIT, EXITLOOP, and NEXTLOOP) jump to a specific point in the code or terminate loops prematurely, regardless of any loop conditions. Because these statements don’t have their own conditions, they are almost always paired with an IF.
The EXIT statement (EXIT [label]) transfers control to the END statement of the current BEGIN-END block. If there are nested BEGIN-END blocks, you can optionally use a label to specify which block you want to exit. The label corresponds to a label on a BEGIN statement.
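To make the label form concrete, here is a minimal sketch; the outer label and the done flag are illustrative names rather than part of the EXIT syntax itself.
record
    done, i4
proc
    done = 1
outer,
    begin
        begin
            if (done)
                exit outer          ;jump to the END of the block labeled outer
            Console.WriteLine("inner work")
        end
        Console.WriteLine("outer work (skipped because we exited the labeled block)")
    end
    Console.WriteLine("execution continues here after the labeled block")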
The GOTO statement (GOTO label or GOTO(label[, ...]), selector) redirects execution control to a specific label. You can specify a single label directly or use a list of labels with a selector. The selector is an expression that selects an element from the list of labels (1 for the first label, 2 for the second, and so on). If the value of the selector is less than 1 or more than the number of labels, execution continues with the code block following the GOTO. You may see the computed GOTO form in your codebase, but it’s best to use one of the more structured control flow options such as USING.
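For reference, here is a brief sketch of both forms (the labels and the selector variable are illustrative); it also hints at why USING usually reads better.
record
    selector, i4
proc
    selector = 2
    goto (handle_one, handle_two), selector     ;2 selects the second label
    Console.WriteLine("selector was out of range")
    goto done
handle_one,
    Console.WriteLine("went to handle_one")
    goto done
handle_two,
    Console.WriteLine("went to handle_two")
done,
    Console.WriteLine("finished")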
The EXITLOOP statement is used to break out of a loop prematurely. When EXITLOOP is executed, it terminates the current loop (DO FOREVER, FOR, REPEAT, WHILE, etc.), and control is transferred to the statement immediately after the loop.
record
counter, i4, 0
threshold, i4, 5
proc
while (counter < 10)
begin
Console.WriteLine("Current Counter Value: " + counter.ToString())
counter += 1
if (counter >= threshold)
begin
Console.WriteLine("Threshold reached. Exiting loop.")
exitloop
end
end
Use NEXTLOOP when you want to terminate the current iteration of a loop but not all remaining iterations. After executing NEXTLOOP, control goes to the next iteration of the current loop (DO, FOR, REPEAT, WHILE, etc.).
record
counter, i4, 0
threshold, i4, 5
proc
while (counter < 10)
begin
counter += 1
if (counter < threshold)
begin
;Skip to the next iteration
nextloop
end
Console.WriteLine("Current Counter Value: " + counter.ToString())
end
Output
Current Counter Value: 5
Current Counter Value: 6
Current Counter Value: 7
Current Counter Value: 8
Current Counter Value: 9
Current Counter Value: 10
As a best practice, limit or eliminate the use of GOTO, as it can make code difficult to read and maintain. Structured control flow with loops, conditionals, and routine calls is preferable. EXIT and EXITLOOP can be very useful for managing control flow, especially when you need to leave a loop or block due to an error condition or when a certain condition is met. NEXTLOOP is also a handy tool when you want to skip the current iteration and continue with the next one.
Quiz
Consider the following IF construct: IF condition THEN statement1 ELSE statement2. What does statement2 represent?
- The statement to be executed when the condition is true
- The statement to be executed when the condition is false
- The condition to be checked after the initial condition is checked
- The default statement that is always executed
In an IF construct, are parentheses around the condition required?
- Yes, the condition must always be enclosed in parentheses
- Yes, but only when using the ELSE IF clause
- No, parentheses can improve readability but are entirely optional
- No, parentheses are not allowed in the IF construct
Which statement about THEN in DBL is correct?
- THEN is always required in IF and ELSE IF statements
- THEN is only required in IF statements
- THEN is never required in DBL
- THEN is required if another ELSE or ELSE IF will follow, but it is not allowed on the last one
Consider you have a piece of code where you need to execute different blocks of code based on the value of a single variable. Which control flow structures are the most appropriate for this purpose in DBL?
- IF, ELSE IF, ELSE
- USING, CASE
- FOR, WHILE
- BEGIN, END
What is the purpose of the ELSE clause in a CASE control flow statement?
- It provides a condition to be checked if no prior conditions have been met
- It acts as the default case that is always executed
- It specifies a block of code to be executed if no case labels match the value of the switch expression
- It causes the program to exit the CASE statement if no match is found
Combining Comparisons
In the previous sections, we explored the fundamentals of relational comparisons, such as == and >, and delved into the essentials of control flow. Building on that foundation, this section, “Combining Comparisons,” compares what life would be like with and without Boolean operators. Seeing the fully expanded long-form explanation will hopefully make it easier for you to reason about complex logical expressions and also help you break down any overly complex expressions you encounter in your codebase.
Boolean operators
In addition to comparison operators, we have Boolean operators. They compare the truth value of operands and return true or false, just like comparison operators. Here they are:
- OR (|| or .OR.): Returns true if either operand is true.
- Exclusive OR (.XOR.): Returns true if exactly one operand is true.
- AND (&& or .AND.): Returns true if both operands are true.
- NOT (! or .NOT.): Returns true if the operand is false.
Like most programming languages, DBL evaluates Boolean operators from left to right. If the result can be determined by the left operand, DBL won’t process the right operand. This is known as short-circuit evaluation.
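Here’s a small sketch of short-circuit evaluation in action: the division on the right-hand side would fail when count is zero, but it is never evaluated because the left operand already decides the outcome.
data count = 0
data total = 100
if (count > 0 && total / count > 10) then
    Console.WriteLine("average is greater than 10")
else
    Console.WriteLine("no items to average")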
To better understand these operators, let’s take a look at some basic examples contrasted with their equivalent code without the operators.
Using the && operator
With &&:
data isAdult = true
data hasTicket = true
if (isAdult && hasTicket)
Console.WriteLine("Access granted.")
Without &&:
data isAdult = true
data hasTicket = true
if (isAdult)
begin
if (hasTicket)
Console.WriteLine("Access granted.")
end
Explanation:
- With &&: The IF statement checks both conditions (isAdult and hasTicket) in a single line. If both are true, the message “Access granted.” is printed.
- Without &&: We use nested IF statements. The outer IF checks isAdult, and the inner IF checks hasTicket. The same result is achieved, but with more verbose code.
Using the || operator
With ||:
data isRainy = true
data isSnowy = false
if (isRainy || isSnowy)
Console.WriteLine("Take an umbrella.")
Without ||:
data isRainy = true
data isSnowy = false
if (isRainy) then
Console.WriteLine("Take an umbrella.")
else if (isSnowy)
Console.WriteLine("Take an umbrella.")
Explanation:
- With ||: The IF statement checks if either isRainy or isSnowy is true. If either condition is met, the message “Take an umbrella.” is printed.
- Without ||: We use separate IF and ELSE IF statements to check each condition independently. The message is printed if either condition is true, but the code is less concise.
Let’s compare the use of the ! (logical NOT) and .XOR. (exclusive OR) operators with alternative implementations using IF ELSE statements.
Using the ! operator
With !:
data isClosed = true
if (!isClosed)
Console.WriteLine("The door is open.")
Without !:
data isClosed = true
if (isClosed) then
begin
;; Do nothing if the door is closed
end
else
Console.WriteLine("The door is open.")
Explanation:
- With !: The ! operator inverts the Boolean value of isClosed. The IF statement checks if isClosed is not true (i.e., false), and if so, prints the message.
- Without !: We use an IF ELSE statement. The IF part is essentially a placeholder, and the ELSE part handles the case when isClosed is false, printing the message.
Using the .XOR. operator
With .XOR.:
data switch1 = true
data switch2 = false
if (switch1 .XOR. switch2)
Console.WriteLine("The light is on.")
Without .XOR.:
data switch1 = true
data switch2 = false
if (switch1) then
begin
if (switch2) then
begin
end
else
Console.WriteLine("The light is on.")
end
else
begin
if (switch2)
Console.WriteLine("The light is on.")
end
Explanation:
- With .XOR.: The .XOR. operator performs an exclusive OR operation. It returns true if exactly one of switch1 or switch2 is true. The message is printed when this condition is met.
- Without .XOR.: We use nested IF ELSE statements. The first IF checks switch1, and its nested IF checks switch2. The ELSE part handles the case when switch1 is false. This approach is more verbose and less straightforward compared to using the .XOR. operator.
Routines
A function, subroutine, or method is a self-contained block of code designed to perform a specific task. The task could be anything, from complex mathematical operations to manipulating data or creating output. Each routine is given a name, and this name is used to call or invoke the routine at different points in a program.
Routines usually take inputs, known as “arguments” or “parameters,” and return one or more outputs or return values. The inputs are values that the routine operates on, and the output is the result of the routine’s operation.
One key feature of functions is that they promote reusability and organization in code. If you have a task that needs to be performed multiple times throughout a program, you can define a function for that task and then call the function whenever the task needs to be performed. This helps reduce repetition and makes the code easier to maintain and understand.
While you may find that a significant portion of your existing application code consists of large routines, opting for smaller, more precise routines over comprehensive “kitchen sink” ones comes with a host of benefits.
First, smaller routines are generally easier to understand and maintain. Each routine has a specific, well-defined purpose, which can be described by its name and the names of its parameters. When you need to modify a routine or diagnose an issue, you can focus on a smaller amount of code that’s specific to one task, rather than having to navigate a larger, more complex routine that handles many tasks.
Second, smaller routines promote code reusability. If a routine performs a single, well-defined task, it’s likely that task could be needed elsewhere in your program or even in other programs. By keeping your routines small and focused, you make it easier to reuse your code, reducing duplication and making your overall codebase more efficient.
Third, smaller routines are easier to test. You can write unit tests for each routine that cover its expected behavior, handling of edge cases, and error conditions. This would be far more challenging with a larger routine where different tasks are intertwined.
Finally, decomposing a problem into smaller parts can often make the problem easier to solve and the solution easier to reason about. It’s a form of “divide and conquer” strategy that’s often very effective in programming.
Subroutines
Subroutines are self-contained blocks of code designed to perform specific tasks. They are similar to functions with a void return type in other languages. There are two types of subroutines in DBL: subroutines and local subroutines.
A subroutine, sometimes referred to as an external subroutine, is a separate entity from the routine that calls it. This subroutine can be present in the same source file as the invoking routine or in a different file.
To define a subroutine, we use the SUBROUTINE statement. The return point to the calling routine is determined by the RETURN or XRETURN statement. Using XRETURN is recommended because it supersedes any nested local subroutine calls. If an exit control statement is not explicitly specified, a STOP statement is implied at the end of the subroutine.
Parameters for subroutines are listed right after the SUBROUTINE statement, using a similar format to field declarations. The subroutine outlines the data type of each parameter, but the size of the argument is defined by the calling program. Note that default parameter values can only be declared when targeting .NET.
subroutine mult
a_result ,n
a_arg1 ,n
a_arg2 ,n
proc
a_result = a_arg1 * a_arg2
xreturn
endsubroutine
A subroutine is invoked using the XCALL statement. In Traditional DBL, if the first parameter is not an alpha, these subroutines can also be used as functions (i.e., in the form %subroutine). This requires the first argument passed to contain the result of the operation and the subroutine to be declared as an external function in your code.
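As a minimal sketch, invoking the mult subroutine defined above with XCALL looks like this (using it in the %mult function form would additionally require declaring it as an external function, which we won’t show here):
main
record
    answer, i4
proc
    xcall mult(answer, 6, 7)
    Console.WriteLine(%string(answer))      ;prints 42
endmain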
Implicit stop
In Traditional DBL, if a subroutine reaches its end without an XRETURN or RETURN, it’s treated as an implicit STOP statement, which results in program termination. This often leads to significant frustration among new developers who are taken aback by this behavior. However, altering this behavior would break backward compatibility for code that relies on it.
Local subroutines
A local subroutine resides in the same method, function, or subroutine in which it’s called. It’s located between the PROC statement and its corresponding END statement of the calling routine, starting with a label and ending with the RETURN statement. Local subroutines don’t accept arguments but share the calling routine’s data. This makes local subroutines function much like class members or the captured variables of a lambda in other languages, allowing for data sharing within a scope.
To invoke a local subroutine, we use the CALL statement. After returning from the call, the processing continues at the line immediately following the CALL statement. This provides a level of encapsulation and data sharing within a single routine or function.
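Here’s a hedged sketch of a local subroutine (the routine, label, and field names are illustrative). Notice that RETURN brings control back to the line after the CALL, while the XRETURN before the label keeps execution from falling into the local subroutine.
subroutine show_squares
    a_count, n
record
    i, i4
    square, i4
proc
    for i from 1 thru a_count
    begin
        call compute_square                     ;invoke the local subroutine
        Console.WriteLine(%string(square))      ;execution resumes here after RETURN
    end
    xreturn                                     ;leave before falling into the label

compute_square,
    square = i * i      ;shares the calling routine's data; takes no arguments
    return
endsubroutine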
TODO - add example and diagram to help understand RETURN vs XRETURN/FRETURN/MRETURN
Functions
A function is a structured block of code designed to perform a specific task. It can return a value and can be invoked from anywhere a literal is allowed in an expression. The function name may optionally be preceded by a percent sign (%). An example invocation of a function could look like this:
value=%function(arguments)
if (%function(arguments))
xcall subroutine(arg1, %function(arguments), arg3)
Functions in DBL can be invoked in a unique manner where the return value is used as an additional argument. For example, a function call that appears as retval = %myRoutine(arg1, arg2) can also be invoked using xcall myRoutine(retval, arg1, arg2).
Functions are declared using the FUNCTION statement, followed by data and procedure divisions. The FRETURN statement is used to terminate the function and return control to the caller.
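As a minimal sketch, here’s a small function and its invocation (the names are illustrative):
main
record
    total, int
proc
    total = %add_two(2, 3)
    Console.WriteLine(%string(total))       ;prints 5
endmain

function add_two, int
    a_x, int
    a_y, int
proc
    freturn a_x + a_y
endfunction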
A function name in Traditional DBL can technically be up to 255 characters long. However, names longer than 30 characters get truncated upon linking, so it’s important to ensure that the first 30 characters are unique.
Modifiers like LOCAL, STACK, and STATIC can specify the default state of unqualified RECORD statements within a function. These modifiers are mutually exclusive and default to LOCAL in Traditional DBL, or STACK in DBL running on .NET, unless another modifier is specified.
The REENTRANT modifier allows a function to be called recursively, and it also changes the default for unqualified RECORD statements to STACK. This is reflected in the following statement:
function fred ,reentrant
TODO example to demonstrate a non reentrant function - maybe just explain why this thing exists and what you can do to get rid of it
Here, all unqualified RECORD statements in function fred behave as STACK RECORD statements.
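As a sketch of where REENTRANT matters, a recursive function such as the following (illustrative) needs the modifier in Traditional DBL so that each activation gets its own copy of its data:
function factorial, int, reentrant
    a_n, int
proc
    if (a_n <= 1) then
        freturn 1
    else
        freturn a_n * %factorial(a_n - 1)
endfunction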
The VARARGS modifier, which is optional for unprototyped functions and subroutines, is required when you want to pass more arguments than declared while using strong prototypes or running on .NET. We will cover prototyping extensively in a later chapter.
Traditional DBL notes
On Windows and Linux, functions not in the calling chain may be unloaded from memory by default. The RESIDENT modifier, if specified, keeps the function in memory.
In Traditional DBL, all expression results are rounded by default, although this behavior can be changed to truncation by enabling system option #11. However, specifying the ROUND or TRUNCATE option on a FUNCTION statement overrides this default behavior for that function. It’s important to note that these options cannot coexist in the same statement and aren’t supported in DBL running on .NET.
DBL on .NET notes
Functions can be declared inside or outside of a namespace or class. Functions declared outside of a class are considered global. In DBL running on .NET, the compiler creates a new global class, _CL, for these global functions during the .exe or .dll file generation. If a function is also declared outside of a namespace, it is housed in _NS_assemblyname._CL.
All routines are reentrant, and LOCAL is synonymous with STACK when running on .NET.
Defining subroutines and functions
Function
[access] [function_mod ...] FUNCTION name[, return_type][, option, ...]
  parameter_def
  .
  .
  .
[ENDFUNCTION|END]
Subroutine
[access] [subroutine_mod ...] SUBROUTINE name[, option, ...]
  parameter_def
  .
  .
  .
[ENDSUBROUTINE|END]
Similarities between function and subroutine declarations
- access (optional): Sets access levels when defined inside a class. Options include PUBLIC, PROTECTED, PRIVATE, INTERNAL, and PROTECTED INTERNAL.
TODO Add more details
-
function_mod / subroutine_mod (optional, DBL on .NET only)
- STATIC: Accessed without a reference object.
- VARARGS: Accepts more arguments than declared.
-
name: The name of the function or subroutine.
-
options / option (optional)
- LOCAL / LOCALDATA: Retains record content for the duration of function/subroutine activation.
- STACK / STACKDATA: Provides unique record content for each activation of function/subroutine.
- STATIC / STATICDATA: Retains record content through all function/subroutine activations.
- REENTRANT: Allows multiple active instances of the function/subroutine.
- RESIDENT: Keeps function/subroutine in memory when not in use (Windows, Linux only).
- ROUND: Rounds implied-decimal data types within function/subroutine.
- TRUNCATE: Truncates implied-decimal data types within function/subroutine (Traditional DBL only).
- VARARGS: Accepts more arguments than declared.
-
parameter_def: Definition of parameters.
Function specific
- return_type (optional): Specifies the return type of the function. Defaults to the type of the variable or literal of the first FRETURN statement.
- size (optional): Specifies the size of the function return value.
Parameters
The direction of data flow for parameters can be controlled using certain parameter modifiers: IN, OUT, and INOUT.
-
IN: Parameters declared with the IN modifier are read-only and meant to be used for input into a subroutine, function, or method. The parameter can be used for reading but cannot be modified, even for local computations within the routine. This is particularly crucial for types like structures, alphas, decimals, integers, and numeric or implied-decimal values that are passed by reference, as it ensures the original data in the calling context isn’t inadvertently altered.
-
OUT: Parameters with the OUT modifier are designed to return data from a subroutine, function, or method back to the calling context. These parameters do not retain their initial value and are expected to be assigned a new value within the called routine before it returns. This effectively provides a way to “return” more than one value from a subroutine or function.
-
INOUT: When the INOUT modifier is applied, it allows a parameter to serve a dual role—as both an input (like an IN parameter) and an output (like an OUT parameter). These parameters can accept input and, after possible modifications within the routine, return output.
By default, subroutines and functions in DBL treat parameters as “unspecified,” which means they can be used for both input and output, akin to INOUT. However, not specifying direction carries more risk, because it doesn’t protect against unintentional modifications to parameters that are not meant to be altered, such as temporary values or literals. Therefore, it’s advisable to always specify the direction of a parameter explicitly.
In Traditional DBL, arguments can be passed to subroutines and functions in one of three ways: by descriptor, by reference, or by value.
-
By descriptor: This mode is the default and most common way of passing arguments. In this mode, any changes made to the variable linked with the parameter in the receiving routine are also reflected in the argument’s value in the calling routine.
-
By reference: This mode is primarily used when passing arguments to non-DBL routines. It functions similarly to passing by descriptor; any modifications made to the variable linked with the parameter in the receiving routine are also reflected in the argument’s value in the calling routine.
-
By value: This mode passes the value of the argument such that any modifications made to the parameter in the routine do not affect the argument’s value in the calling routine. Objects and ^VAL are passed this way for IN parameters.
In DBL running on .NET, two familiar mechanisms for passing parameters are used: by value (BYVAL) and by reference (BYREF). These behave the same way as their counterparts in other languages running on .NET.
-
BYVAL: This mode passes a copy of the variable to the routine. This means that any changes made to the parameter inside the routine do not affect the original argument.
-
BYREF: This mode is used for objects and passes a reference to the object. This means that if the routine modifies the value, the change is reflected in both the calling and the called routine.
When DBL is running on .NET, arguments that would have been passed by descriptor in Traditional DBL are actually object handles passed by value. This implementation detail aligns with .NET’s standard practice of passing object handles by value. Because of the semantics of IN, OUT, and INOUT, this detail is hidden unless you dig into the guts.
record
value1, int
value2, int
value3, int
value4, int
proc
value1 = 10
value2 = 20
value3 = 30
value4 = 0
xcall InParameter(value1)
xcall OutParameter(value2)
xcall InoutParameter(value3)
value4 = %ReturnValue()
Console.WriteLine("Value1 (in): " + %string(value1)) ; Expected output: 10
Console.WriteLine("Value2 (out): " + %string(value2)) ; Expected output: 100
Console.WriteLine("Value3 (inout): " + %string(value3)) ; Expected output: 60
Console.WriteLine("Value4 (return): " + %string(value4)) ; Expected output: 99
end
subroutine InParameter
in param, int
proc
; Changes here will result in a compiler error
; DBL-E-READONLY, Cannot write to read-only data : param = param * 2
; param = param * 2
xreturn
endsubroutine
subroutine OutParameter
out param, int
proc
param = 100 ; The original value is overwritten
xreturn
endsubroutine
subroutine InoutParameter
inout param, int
proc
param = param * 2 ; Changes here will affect the original value
xreturn
endsubroutine
function ReturnValue, int
proc
freturn 99
endfunction
Output
Value1 (in): 10
Value2 (out): 100
Value3 (inout): 60
Value4 (return): 99
%DBR-S-STPMSG, STOP
Mismatch
The MISMATCH modifier provides flexibility with weakly typed systems, permitting you to bypass type checking and pass variables of one type as arguments to parameters of a different type without raising a prototype mismatch error.
When MISMATCH is used with an alpha, numeric, decimal, or implied-decimal parameter type, it allows the program to interchangeably pass either alpha or numeric type arguments to that parameter. This feature provides a certain level of freedom, but it must be used with care to avoid unexpected behaviors or errors.
Using MISMATCH with a numeric parameter is only recommended for routines that either pass the parameter as an argument to another routine also marked MISMATCH numeric or where the data type is explicitly controlled with casting. Using MISMATCH numeric in other situations might lead to unexpected results, especially when running DBL on .NET.
For instance, when an alpha parameter is passed to a MISMATCH numeric parameter, it’s interpreted as decimal in Traditional DBL but remains alpha in DBL on .NET, leading to subtle differences in behavior. To safely pass an alpha to a numeric parameter, consider using explicit casting unless the routine uses MISMATCH numeric.
When you’re dealing with a MISMATCH alpha parameter and expecting a decimal parameter to be treated as an alpha, use casting to explicitly control the datatype.
In situations where you want to pass a decimal variable to a routine with an alpha parameter and you’re not using explicit casting when writing to it, using MISMATCH alpha is not advisable. Instead, convert the routine to use a numeric parameter and use MISMATCH numeric, along with appropriate casting, when the parameter is used as an alpha.
TODO Note because of the usefulness in resolving common prototyping errors for legacy code, this is worth an extensive example, most importantly including the unexpected results from a poor mismatch choice vs an un-prototyped mismatched parameter. Also I need a deeper pass over to inject the why around mismatch parameters and maybe reduce some of the bland fluff
proc
Console.WriteLine("lookalike data")
xcall mismatched_params(5, "5", "5.5")
Console.WriteLine("correctly typed data")
xcall mismatched_params("5", 5, 5.5)
xreturn
end
subroutine mismatched_params
mismatch param1, a
mismatch param2, n
mismatch param3, n.
record
idField, d3.1
proc
Console.WriteLine(param1)
Console.WriteLine(%string(param2 + 5))
Console.WriteLine(%string(param3 + 5.5))
idField = param3
Console.WriteLine(%string(idField))
xreturn
endsubroutine
Output
lookalike data
5
10
650.5
>5.0
correctly typed data
5
10
11.0
5.5
%DBR-S-STPMSG, STOP
Optional vs default
Optional parameters and parameters with default values both offer flexibility when invoking functions or methods. However, their behavior differs when it comes to determining if an argument was passed.
Optional parameters can be omitted from the function or method call. You can use the built-in function ^PASSED to check at runtime whether an argument for an optional parameter has been supplied. If ^PASSED returns false, it indicates that no argument was provided for the parameter in the function call. Here’s an example of ^PASSED in action:
subroutine sub
arg1 ,a ;Optional parameter
arg2 ,a ;Optional parameter
arg3 ,a ;Optional parameter
proc
Console.WriteLine(^passed(arg1))
Console.WriteLine(^passed(arg2))
Console.WriteLine(^passed(arg3))
endsubroutine
main
proc
xcall sub(,"hi")
endmain
Output
0
1
0
For leading or middle parameters, you can use a comma with no argument to indicate that a parameter is not passed. You can do the same thing for trailing arguments, but as you can see in the example above, you can also just omit them entirely.
On the other hand, parameters with a default value are technically always supplied an argument. If no explicit argument is passed in the function call, the default value is used. As a result, ^PASSED will always return true for these parameters, indicating that an argument, even if it’s the default one, was provided. This behavior effectively makes these parameters a hybrid between optional and required parameters.
Methods, properties, lambdas, delegates
These are function-like things, and we will describe them in much more detail in later chapters. You’ve already seen at least one example of a method: Console.WriteLine is a static method on a class named System.Console.
Preprocessor
The preprocessor handles several compiler directives that begin with a period (.) and processes them before the actual compilation occurs. Let’s take a look at the tasks that can be performed using the preprocessor:
Text replacement with .DEFINE
Think of .DEFINE as creating a search-and-replace rule for your code. It comes in two forms:
Simple replacement:
.DEFINE TTCHN, 1
This tells the compiler “whenever you see TTCHN, replace it with 1.” It’s similar to creating a constant but happens during preprocessing. A practical use would be
.DEFINE MAX_USERS, 100
.DEFINE DATABASE_PATH, "data/users.dat"
Parameterized macros:
.DEFINE SUBTOTAL(desc, amount) writes(pchan, "Subtotal for ''desc': "+%string(amount))
This creates a more sophisticated replacement pattern that accepts parameters. It’s like creating a template that can be filled in with different values. When you write
SUBTOTAL(Apples, apples_total)
the preprocessor expands it to
writes(pchan, "Subtotal for Apples: "+%string(apples_total))
Limitations
Preprocessor expansion in DBL is limited to a single statement (i.e., a single line). This means that the following attempt to expand three statements is not possible with a single macro:
.DEFINE SUBTOTAL(desc, amount) open(pchan, O, "TT:") & writes(pchan, "Subtotal for ''desc': "+%string(amount)) & close(pchan)
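One workaround, sketched below, is to keep the macro itself to a single statement and move the multi-step logic into a subroutine. The print_subtotal routine is illustrative, and the ''desc' expansion follows the quoting pattern shown above.
.DEFINE SUBTOTAL(desc, amount) xcall print_subtotal("''desc'", amount)

subroutine print_subtotal
    a_desc, a
    a_amount, n
proc
    Console.WriteLine("Subtotal for " + a_desc + ": " + %string(a_amount))
    xreturn
endsubroutine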
Conditional compilation
DBL offers three main ways to conditionally include or exclude code during compilation:
.IF for expression-based decisions:
.IF user_count > 100
writes(1, "Large user base detected")
.ELSE
writes(1, "Small user base")
.ENDC
.IFDEF for checking definitions:
.DEFINE DEBUG_MODE
.IFDEF DEBUG_MODE
writes(1, "Debug: Entering main routine")
.ENDC
.IFNDEF for checking missing definitions:
.IFNDEF ERROR_HANDLER
.DEFINE ERROR_HANDLER
; Define default error handling
.ENDC
Including external code
The .INCLUDE directive helps organize code by allowing you to split it across multiple files:
.INCLUDE "database_config.dbl"
.INCLUDE "LIBRARY:common_routines"
Best practices:
- Use UPPERCASE for defined constants to distinguish them from regular variables.
- Define constants at the beginning of your code or in a separate include file.
- Use parameterized macros for repeated code patterns that need slight variations.
- Use conditional compilation to handle different environments (development, production) or feature flags.
- Structure your includes to avoid circular dependencies.
For example, a well-organized DBL program might look like this:
.DEFINE VERSION, "1.0.0"
.DEFINE MAX_CONNECTIONS, 50
.DEFINE LOG(message) writes(log_channel, "["+ %date() +"] "+ message)
.IFNDEF PRODUCTION
.DEFINE DEBUG_MODE
.ENDC
.INCLUDE "config.dbl"
main
record
connection_count, i4
proc
.IFDEF DEBUG_MODE
LOG("Application starting in debug mode")
.ENDC
if connection_count > MAX_CONNECTIONS
LOG("Connection limit exceeded")
end
Quiz Answers
Variables
- What are the two main divisions in a DBL program?
- Procedure division and memory division
- Data division and procedure division
- Data division and memory division
- Procedure division and static division
- What keyword allows for variable declarations within the procedure division in recent versions of DBL?
- Var
- Data
- Local
- Static
- What are the storage specifiers available for records in DBL?
- Static, local, and global
- Stack, global, and local
- Stack, static, and local
- Local, global, and data
- What is the primary difference between “stack” and “static” variables in DBL?
- Stack variables are shared across the program, while static variables are unique to each function
- Static variables retain their value between function calls, while stack variables are deallocated when the scope is exited
- Stack variables persist until the program ends, while static variables are deallocated when the scope is exited
- Static variables are shared across the program, while stack variables are unique to each function
- How are “data” variables stored in DBL?
- They are stored statically
- They are stored locally
- They are stored on the stack
- They are stored globally
- What data containers in DBL can hold multiple related data items of various types?
- Functions
- Groups
- Records
- Variables
- True or False: It’s mandatory to put ENDRECORD at the end of a record declaration in DBL.
- False. It’s not mandatory to put ENDRECORD at the end of a record declaration in DBL, but it is considered good form.
Primitives
-
What are the three forms of alpha types in DBL?
- a, alpha, a10
- a, a*, a{size}
- a, a10, a*
- a, a*, a{size}+
-
What’s the maximum length of alpha types in 32-bit Windows and Linux when running Traditional DBL?
- 65,535 characters
- 32,767 characters
- 2,147,483,647 characters
- 2,147,483,648 characters
-
In Traditional DBL, how is a byte represented?
- Eight-bit unsigned integer
- Eight-bit signed integer
- Sixteen-bit signed integer
- Sixteen-bit unsigned integer
-
What is the main difference between the decimal and implied-decimal types in DBL?
- Decimal types are unsigned; implied-decimal types are signed.
- Decimal types are whole numbers; implied-decimal types have a sized precision.
- Decimal types have a fractional precision; implied-decimal types are whole numbers.
- Decimal types are signed; implied-decimal types are unsigned.
-
What is the default value for the .NET System.Double structure in DBL?
- 1.0
- 0.0
- NULL
- Undefined
-
Which types does short map to in Traditional DBL and DBL on .NET, respectively?
- i2 and System.Int32
- i2 and System.Int64
- i2 and System.Int16
- i4 and System.Int16
-
When defining a record, when must you include a size for these DBL types: a, d, i, id?
- Only when the fields need to be stored in an array
- Only when the fields will be processed in a loop
- When the fields define a memory layout (this is most of the time)
- When the fields are initialized with a certain value
-
In DBL, when is the size of parameters and non-^VAL return types determined?
- At compile time
- At runtime
- When they are initialized
- When they are declared
-
What happens when you pass the string “a slightly longer alpha” to a routine with an unsized alpha parameter myFld, a?
- An error occurs as the parameter is unsized
- The parameter myFld will have a size of 23
- The parameter myFld will have a size of 24, including the end character
- The program will crash
Control Flow
Mini quiz
What does this program output if month is 5 instead of 3?
- shrug
What does this program output if month is 5555 instead of 3?
- wild month
-
Consider the following IF construct: IF condition THEN statement1 ELSE statement2. What does statement2 represent?
- The statement to be executed when the condition is true
- The statement to be executed when the condition is false
- The condition to be checked after the initial condition is checked
- The default statement that is always executed
-
In an IF construct, are parentheses around the condition required?
- Yes, the condition must always be enclosed in parentheses
- Yes, but only when using the ELSE IF clause
- No, parentheses can improve readability but are entirely optional
- No, parentheses are not allowed in the IF construct
-
Which statement about THEN in DBL is correct?
- THEN is always required in IF and ELSE IF statements
- THEN is only required in IF statements
- THEN is never required in DBL
- THEN is required if another ELSE or ELSE-IF will follow, but it is not allowed on the last one
-
Consider you have a piece of code where you need to execute different blocks of code based on the value of a single variable. Which control flow structures are the most appropriate for this purpose in DBL?
- IF, ELSE IF, ELSE
- USING, CASE
- FOR, WHILE
- BEGIN, END
-
What is the purpose of the ELSE clause in a CASE control flow statement?
- It provides a condition to be checked if no prior conditions have been met
- It acts as the default case that is always executed
- It specifies a block of code to be executed if no case labels match the value of the switch expression
- It causes the program to exit the CASE statement if no match is found
Fetching Fun: A Simple HTTP Client Project
Now that you have a good understanding of routines, variables, control flow, and strings, let’s put them to use in a simple HTTP client project. We’ll use this project to explore the HTTP and JSON APIs in DBL. You’ll see a few new APIs and some very basic file I/O. We aren’t going to go into complete detail on the APIs we use here, but we’ll give you enough to get started. You can find more information on the APIs we use in the Synergy DBL Language Reference.
HTTP Routines
To show off the HTTP document transport API in DBL, we’re going to make use of httpbin.org, which contains sample HTTP endpoints that you can use for testing. Specifically, we’ll use three endpoints: the HTML endpoint, which returns a small HTML document; the Anything endpoint, which echoes the HTTP request data; and the Status endpoint, which returns a given HTTP status code. If you’re in .NET, you can of course use the HttpClient class directly, but there are loads of examples of that on the internet already, so we’ll show how to use the Traditional DBL HTTP routines here.
Using the HTML and Anything endpoints
The HTML endpoint simply returns a small HTML document, which makes it a good first target for a GET request. The Anything endpoint will echo back any data we send to it, which is useful for understanding what data is being passed in your HTTP request.
Make a GET request
Let’s start with a simple GET request.
record
response, @string
errtxt, @string
status, int
responseHeaders, [#]string
proc
status = %http_get("https://httpbin.org/html",5,response,errtxt,^NULL,responseHeaders,,,,,,,"1.0")
Console.WriteLine(response)
This code sends a GET request to the HTML endpoint and prints the response. We aren’t doing anything with the response headers, there’s no error handling, and we’re requesting the HTTP 1.0 protocol.
Make a POST request
Now, let’s try a POST request with some JSON data.
record
response, @string
request, @string
errtxt, @string
status, int
responseHeaders, [#]string
requestHeaders, [#]string
proc
request = '{"key": "value"}'
requestHeaders = new string[#] { "Content-Type: application/json" }
status = %http_post("https://httpbin.org/anything",5,request, response,errtxt,requestHeaders,responseHeaders,,,,,,,"1.0")
Console.WriteLine(response)
This sends a POST request with some hard-coded JSON data and prints the response.
Using the Status endpoint
The Status endpoint returns a response with the HTTP status code you specify. This is useful for testing how your code handles different HTTP responses.
Request a specific status code
Let’s request a 404 status code.
record
response, @string
errtxt, @string
status, int
responseHeaders, [#]string
proc
status = %http_get("https://httpbin.org/status/404",5,response,errtxt,^NULL,responseHeaders,,,,,,,"1.0")
Console.WriteLine("status was: " + %string(status) + " error text was: " + errtxt)
This will print 404, indicating the status code of the response.
Handling different status codes
Experiment with different status codes to see how your HTTP client handles them. For example, try 200, 400, 500, etc.
record
response, @string
errtxt, @string
status, int
responseHeaders, [#]string
codes, [#]int
proc
codes = new int[#] { 200, 400, 500 }
foreach data code in codes
begin
status = %http_get("https://httpbin.org/status/" + %string(code),5,response,errtxt,^NULL,responseHeaders,,,,,,,"1.0")
Console.WriteLine("request code was: " + %string(code) + " result status was: " + %string(status) + " error text was: " + errtxt)
end
This code loops through a list of status codes, makes a request for each one, and prints the status code of the response. Notice that HTTP 200 is treated specially: it is the only status code that results in a 0 status value; all other status codes are returned as expected.
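Tying this back to the control flow chapter, a hedged sketch of acting on the returned status might look like this:
using status select
(0),
    Console.WriteLine("success (HTTP 200)")
(404),
    Console.WriteLine("not found")
(500 thru 599),
    Console.WriteLine("server error: " + %string(status))
(),
    Console.WriteLine("unexpected status: " + %string(status))
endusing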
Writing a file
Let’s call the image endpoint and write the response to a file. This process can be used to download any kind of file, not just images. Keep in mind, though, that this is not a streaming API, so you’ll need to have enough memory to hold the entire file in memory.
record
response, @string
errtxt, @string
status, int
responseHeaders, [#]string
fileChan, int
i, int
proc
status = %http_get("https://httpbin.org/image/png",5,response,errtxt,^NULL,responseHeaders,,,,,,,"1.0")
;open "test.png" in output mode in the current directory
open(fileChan=0, O, "test.png")
;chunk the response into 1024 byte chunks and write them to the file
while(i < response.Length)
begin
if(i + 1024 > response.Length) then
puts(fileChan, response.Substring(i, response.Length - i))
else
puts(fileChan, response.Substring(i, 1024))
i += 1024
end
;don't forget to close the file!
close(fileChan)
DBL I/O routines operate on alphas, not strings. Because alphas have length limitations, we’ll need to write the file in chunks. This code loops through the response and writes it to a file in 1024-byte chunks. Additionally, because we don’t want any newlines or other characters inserted into the file, we use the PUTS statement instead of WRITES. With any luck, you should now have a file called test.png in the directory where you’re running your program, and that .png file should have a picture of a pig in it.
Now that we’ve got the basics down, let’s try something a little more complicated and start calling a REST API to get some JSON data.
Interacting with a REST API to Retrieve JSON Data
Now that you’re familiar with basic HTTP interactions using DBL, let’s delve into something more complex: calling a REST API and processing JSON data. We’ll use jsonplaceholder.typicode.com for this purpose, a fake online REST API often used for testing and prototyping. Specifically, we’ll retrieve a list of posts and then process the JSON response using the System.Text.Json.JsonDocument API in DBL.
Imports
First, make sure you have included the necessary namespace:
import System.Text.Json
Fetching data from a REST API
We’ll start by creating a function to make a GET request to the /posts endpoint of JSONPlaceholder, which returns a list of sample blog posts in JSON format.
function GetPosts, @string
record
response, @string
errtxt, @string
status, int
responseHeaders, [#]string
proc
status = %http_get("https://jsonplaceholder.typicode.com/posts", 5, response, errtxt, ^NULL, responseHeaders, , , , , , , "1.0")
freturn response
endfunction
This function sends a GET request to the /posts endpoint and returns the JSON response.
Processing JSON data
Now, let’s process the JSON data we received. We’ll call our function and use System.Text.Json.JsonDocument for parsing the JSON string.
record
jsonDoc, @JsonDocument
jsonElement, @JsonElement
post, @string
arrayIterator, int
arrayLength, int
proc
jsonDoc = JsonDocument.Parse(%GetPosts())
arrayLength = jsonDoc.RootElement.GetArrayLength()
for arrayIterator from 0 thru arrayLength-1 by 1
begin
jsonElement = jsonDoc.RootElement[arrayIterator]
post = jsonElement.GetProperty("title").GetString()
Console.WriteLine("Post Title: " + post)
end
In this snippet, we parse the JSON response into a JsonDocument. Then, we iterate over the array of posts, extracting and printing the title of each post. It’s important to note that in Traditional DBL, you must keep the jsonDoc variable in scope for the lifetime of the jsonElement variable: jsonElement holds a reference into jsonDoc, so if jsonDoc goes out of scope, jsonElement becomes invalid.
There’s a lot more you can do with the JSON API, and jsonplaceholder.typicode.com has a lot more endpoints you can play with. If you’re running .NET, see the Microsoft documentation for the System.Text.Json.JsonDocument API; if you’re running Traditional DBL, see the Synergex documentation for the Json.JsonDocument class.
Writing data with Utf8JsonWriter
Utf8JsonWriter is borrowed from .NET and provides a high-performance way to write JSON data. Because Traditional DBL doesn’t have streams, we’ll write a new program that uses a System.Text.StringBuilder as our output target and then fling our JSON data at httpbin.org to show it off.
Imports
First, make sure you have included the necessary namespaces:
import System.Text
import System.Text.Json
Initializing Utf8JsonWriter
Utf8JsonWriter writes JSON data to an output buffer. In this example, we’ll use a StringBuilder as that buffer.
main
record
outputBuffer, @StringBuilder
jsonWriter, @Utf8JsonWriter
proc
outputBuffer = new StringBuilder()
jsonWriter = Utf8JsonWriter.CreateUtf8JsonWriter(outputBuffer)
endmain
This code initializes a Utf8JsonWriter that writes to our StringBuilder buffer.
Writing JSON data
Let’s create a simple JSON object with a few properties.
In order to start the JSON object, we need to call WriteStartObject()
.
jsonWriter.WriteStartObject()
This begins our JSON object. Now we can add some properties to the JSON object.
jsonWriter.WriteString("name", "John Doe")
jsonWriter.WriteNumber("age", 30)
jsonWriter.WriteBoolean("isMember", true)
These lines add a string, a number, and a Boolean property to the JSON object. We can now conclude the JSON object writing.
jsonWriter.WriteEndObject()
It’s important to flush the Utf8JsonWriter to ensure all data is written to the output buffer.
jsonWriter.Flush()
Doing something with the data
Now that we’ve written some JSON data to our StringBuilder buffer, let’s do something with it. We’ll send it to the httpbin.org Anything endpoint to see what we’ve written.
main
record
outputBuffer, @StringBuilder
jsonWriter, @Utf8JsonWriter
response, @string
request, @string
errtxt, @string
status, int
responseHeaders, [#]string
requestHeaders, [#]string
proc
outputBuffer = new StringBuilder()
jsonWriter = Utf8JsonWriter.CreateUtf8JsonWriter(outputBuffer)
jsonWriter.WriteStartObject()
jsonWriter.WriteString("name", "John Doe")
jsonWriter.WriteNumber("age", 30)
jsonWriter.WriteBoolean("isMember", true)
jsonWriter.WriteEndObject()
jsonWriter.Flush()
request = outputBuffer.ToString()
requestHeaders = new string[#] { "Content-Type: application/json" }
status = %http_post("https://httpbin.org/anything",5,request, response,errtxt,requestHeaders,responseHeaders,,,,,,,"1.0")
Console.WriteLine(response)
endmain
The response should look something like this:
{
"args": {},
"data": "{\"name\":\"John Doe\",\"age\":30,\"isMember\":true}",
"files": {},
"form": {},
"headers": {
"Content-Length": "44",
"Content-Type": "application/json",
"Host": "httpbin.org",
"X-Amzn-Trace-Id": "Root=1-657937b8-492f8fc147c17845481295b9"
},
"json": {
"age": 30,
"isMember": true,
"name": "John Doe"
},
"method": "POST",
"origin": "10.1.1.1",
"url": "https://httpbin.org/anything"
}
There are a few things to note here. First, the data property is a string representation of the JSON data we sent. Second, the json property is a JSON object representation of the JSON data we sent. Third, the headers property contains the headers we sent with our request. Finally, the method property contains the HTTP method we used. It would be an interesting exercise to parse this response and extract something useful using what you’ve learned so far about JSON parsing in DBL.
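As a hedged starting point for that exercise (assuming the response variable still holds the reply shown above, that the local names are illustrative, and that import System.Text.Json is in effect), you could pull the echoed name back out like this:
begin
    data respDoc, @JsonDocument
    data echoedName, @string
    respDoc = JsonDocument.Parse(response)
    echoedName = respDoc.RootElement.GetProperty("json").GetProperty("name").GetString()
    Console.WriteLine("echoed name: " + echoedName)
end
When you’re done experimenting, it’s time to up our data structure game and learn about complex types.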
Complex Types
Data can be broadly categorized into simple types (like integers, booleans, and decimal numbers) and complex types. Complex types, including classes, structures, and interfaces, are present in most modern programming languages. They allow developers to design more human-readable, organized, and modular code by representing more intricate data structures or behaviors.
The very notion of creating complex types stems from the need to manage and abstract the inherent complexity in software systems. By using classes, structures, and interfaces, developers can create modular, reusable components. These components can be pieced together in various configurations to build higher-level functionalities or to represent intricate data models.
The essence of using these complex types lies in their composable nature. As systems grow and requirements evolve, the ability to break down and reconstruct components becomes indispensable. Complex types, designed with composability in mind, provide the building blocks that make such modularity possible, enabling developers to tackle intricate problems with clarity and efficiency.
Repository
A data dictionary
A data dictionary is a centralized repository of information about data, detailing the description, structure, relationships, usage, and regulations surrounding it. Think of it as a “metadata container,” providing developers, database administrators, and other key participants a comprehensive overview of data items in a system. Synergy/DE Repository is the DBL version of a data dictionary.
A typical repository contains entries for each data element or database object and may include its name, type, permissible values, default values, constraints, and description. It may also include details about primary keys, foreign keys, indexes, and relationships between structures.
Beyond mere documentation, a well-maintained repository promotes consistency across large projects. By offering a standardized definition of each data item, it ensures that all stakeholders have a unified understanding, which is especially crucial in large teams or when a codebase has outlasted multiple generations of developers. For instance, when a developer refers to a “customer ID,” the repository can provide clarity on its format, whether it’s alphanumeric, its length, any constraints, and perhaps even its history or any business rules associated with it.
Moreover, as software systems evolve and scale, the repository evolves with them. When new data elements are introduced or existing ones undergo changes, the repository can be updated accordingly. This dynamic nature makes it an invaluable tool for data governance and auditing, helping organizations trace how data definitions have changed over time.
Code generation with Repository
CodeGen is a powerful code generator designed for DBL. Rather than writing repetitive, boilerplate code by hand, developers can use CodeGen to automatically generate large portions of their application based on predefined templates and patterns. This not only speeds up the development process but also ensures that the generated code adheres to best practices and is consistent throughout the application.
Now, imagine harnessing the detailed data definitions from Repository and feeding them into CodeGen. This integration transforms the development process. With the insights from the data dictionary, CodeGen can produce code that’s tailored to the specific data structures and relationships defined in Repository. For example, if Repository defines a customer entity with specific attributes and relationships, CodeGen can automatically generate the data access layer, CRUD operations, and even user interface components for managing customer data.
Furthermore, as data definitions evolve in Repository, developers can rerun CodeGen to update the corresponding parts of the application, ensuring that the software remains aligned with the latest data schema. This iterative process reduces manual errors, enhances maintainability, and ensures that the application remains data-centric.
Defining a repository
There are two ways to build a repository: textually using the Synergy Data Language or visually using the Repository application. We will focus on the Synergy Data Language in this book because it’s more flexible and easier to maintain. However, the repository program is a great tool for visualizing the data dictionary and can be used to generate the data definition language.
Including from a repository
The .INCLUDE statement is used to include a repository structure in a program. The use of the word “structure” here is a bit misleading, because “structure” can produce a variety of data types, including structures, commons, records, and groups. Think of .INCLUDE "structure" REPOSITORY as your language interface to the data structures stored in your data dictionary.
.INCLUDE "structure" REPOSITORY ["rpsfile_log"][, type_spec][, qualifier]
Key components:
-
Repository source:
- rpsfile_log: Refers to a logical name representing the repository main and text filenames.
- If it’s not specified, the default logical DBLDICTIONARY is used.
- If DBLDICTIONARY is undefined, the compiler resorts to using RPSMFIL and RPSTFIL.
- Best practice: Always capitalize the logical in .INCLUDE statements for accuracy and to prevent errors.
-
Inclusion of fields:
- Fields without the “Excluded by Language” flag are included by their names.
- Fields with the “Excluded by Language” flag become unnamed fields with the appropriate size. Overlay fields with this flag won’t be included.
-
Type specification (type_spec):
- Determines the data structure type, with options like COMMON, RECORD, STRUCTURE, etc.
- If unspecified, RECORD is the default.
- NORECORD results in field creation without a RECORD statement.
-
Modifiers and qualifiers:
- Depending on the .INCLUDE location, suitable modifiers can be appended.
- Access modifiers (PUBLIC, PROTECTED, PRIVATE) define the access scope.
- OPTIONAL and REQUIRED designate argument necessity.
- Directional modifiers (IN, INOUT, OUT) stipulate data flow.
-
Prefixes:
- If a repository structure has a group with a defined “Member prefix,” this prefix is added to the group’s member fields only if the “Use by compiler” flag is active.
- Using the PREFIX qualifier results in an additional prefix, compounded with any prefixes specified for group members when the “Use by compiler” flag in Repository is active.
- Prefixes are often used to avoid naming conflicts between fields in different records or groups, though this has become less of an issue with improvements to the abbreviated path mechanism.
Usage contexts:
- Enumerations: Can be included either globally or within classes/namespaces. Inclusion terminates automatically.
- Structures: Suitable for various contexts such as argument groups, class records, and global data sections. Qualifiers can be adjusted depending on the context.
Recommendations:
- Use RPSMFIL and RPSTFIL instead of DBLDICTIONARY because these environment variables are compatible with the Synergy/DE Repository.
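To make this concrete, here’s a minimal sketch of pulling a repository structure into a DBL routine. The structure name CUSTOMER and the field mentioned in the comments are assumptions for illustration; the statement follows the syntax shown above and relies on RECORD being the default type_spec.
main
    ;; Bring the hypothetical CUSTOMER structure in from the repository.
    ;; RECORD is the default type_spec, so the compiler creates a record
    ;; whose fields come straight from the data dictionary definition.
    .include "CUSTOMER" repository
proc
    ;; Fields defined for CUSTOMER in the repository (for example, a
    ;; hypothetical cust_name field) are now usable like any record field.
    Console.WriteLine("CUSTOMER record included from the repository")
endmain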
Relationship to the Synergy Method Catalog
While the purpose of the repository is to store and organize the structure and relationships of your application’s data and to define the schema for your Synergy applications, the Synergy Method Catalog (SMC) is a tool to define and manage functions and subroutines that you want to be able to call remotely to handle your application’s logic.
The two share metadata: the repository often serves as the source for data structures that methods in the method catalog manipulate. For example, a method defined in the catalog may operate on data structures or entities defined in the repository. Together, they provide a holistic approach to building applications where data and logic are tightly integrated but separately managed.
The method catalog can directly reference repository definitions to ensure consistency between data and the methods that process it. This reduces duplication and errors in defining how data is handled.
Synergy Data Language
Synergy Data Language Rules
When using the Synergy Data Language (schema) with DBL, adhere to the following rules:
- Case sensitivity: Names, keywords, and arguments are case-insensitive, except for quoted definitions, which must be uppercase. Non-quoted data gets automatically converted to uppercase upon input.
- Data validity: If data is missing or invalid, the statement will be disregarded.
- Order of definitions: Though definitions generally have a flexible order, there are exceptions as detailed in the “Recommended definition order” section.
- Keyword constraints:
- Keywords can appear in any sequence unless otherwise stated.
- Multi-word keywords cannot extend over multiple lines.
- Some keywords require the keyword and its associated data to stay on the same line. Such instances are noted with the keyword.
- Keyword data with colons should not contain spaces.
- Negative values are not allowed for numeric keyword arguments, except where stated.
- String data:
- Strings should be wrapped in matching double (“ “) or single (’ ’) quotes and not extended over multiple lines.
- Strings can contain quotation marks, provided they differ from the enclosing marks. For instance, “Type ‘Return’ to continue” and ‘Type “Return” to continue’ are both valid.
- Data limitations: Oversized data will be shortened.
- Comments: Start a line with a semicolon to indicate a comment. Avoid comments within a Synergy Data Language statement.
Recommended definition order:
Although definitions are flexible, there are guidelines for better organization:
- Global formats.
- Enumerations.
- Templates in parent-child order.
- Structures in reference order, with their formats, fields, keys, relations, and aliases. Structures referring to another, via an implicit group or Struct data type, should be defined first.
- Files.
Processing rules for schema files:
Schema files can create a new repository or modify an existing one.
- New repositories: When generating a new repository using a schema file, if errors are detected during the process, the new repository won’t be formed, necessitating schema file corrections.
- Updating repositories: For existing repositories, the utility makes a duplicate before performing updates. Errors will result in the deletion of this copy. Once the repository is updated successfully, use the Verify Repository and Validate Repository utilities to check it.
- Loading schemas: The Load Repository Schema utility can handle both new and existing definitions. Here are some behavior differences depending on the options selected:
- For new repositories, duplicate definitions in the schema file cause error logs.
- Merging schemas into existing repositories will add new definitions. Existing definitions can either be replaced or overlaid. The “Replace” option discards the existing structure and integrates the new schema’s structure. The “Overlay” option updates existing fields from the schema file and incorporates new ones without any deletions.
Long descriptions:
Most of the definition types provided by the Synergy Data Language have an option to include a long description. Originally this was intended to be something like a comment to future users of the repository. However, it has evolved into being used by CodeGen to provide extra data when needed. Many of the long description tags are specific to Harmony Core users but can still be useful in other contexts. Here’s an example long description tag: HARMONYCORE_CUSTOM_FIELD_TYPE=type;
Common attributes: All of the following attributes can be applied to TEMPLATE, FIELD, and GROUP unless otherwise specified:
-
Mandatory attributes:
- name: The name of the template, field, or group. This name can have a maximum of 30 characters
- TYPE type: The item’s type.
- Valid Types: ALPHA, DECIMAL, INTEGER, DATE, TIME, USER, BOOLEAN, ENUM, STRUCT, AUTOSEQ, AUTOTIME.
- Default formats are assigned to each type, but they can be overridden with the FORMAT attribute:
- DATE - YYMMDD
- TIME - HHMM
- USER - ALPHA
- SIZE size: The item’s size.
- Defines the field’s maximum character length.
- If omitted, the size is derived from existing fields or the assigned template.
-
Optional attributes:
- PARENT template: (Only for TEMPLATE) Parent template specification.
- STORED store_format: How it’s stored. Must follow the TYPE keyword if present
- ENUM name or STRUCT name: For enumerations or structures.
- LANGUAGE
- DESCRIPTION “description”: A description of the field definition. It can have a maximum of 40 characters and must be enclosed in double or single quotation marks (“” or ‘’).
- LONG DESCRIPTION “long_desc”: Long_desc can contain 30 lines of up to 60 characters each. Each line must be enclosed in double or single quotation marks (“” or ‘’)
-
Optional presentation attributes:
- SCRIPT, REPORT: Options for viewing, with additional VIEW or NOVIEW.
- POSITION, FPOSITION, PROMPT, HELP, INFO LINE, USER TEXT, FORMAT, REPORT HEADING, ALTERNATE NAME, REPORT JUST, INPUT JUST, PAINT: Presentation and descriptive attributes.
- RADIO|CHECKBOX: Toggle choices.
- FONT, PROMPTFONT: Font specifications.
- READONLY, DISABLED: Functional attributes.
- COLOR, HIGHLIGHT, REVERSE, BLINK, UNDERLINE: Aesthetic attributes.
- DISPLAY LENGTH, VIEW LENGTH, UPPERCASE, NODECIMAL, DECIMAL_REQUIRED: Display characteristics.
- RETAIN POSITION, DEFAULT, AUTOMATIC, NOECHO, DATE, TIME, WAIT, INPUT LENGTH, BREAK: Behavior specifications.
- REQUIRED, NEGATIVE, NULL: Value constraints.
- MATCH CASE, MATCH EXACT: Matching rules.
- SELECTION LIST, SELECTION WINDOW: Selection parameters.
- ENUMERATED, RANGE: Enumeration and range details.
- Methods like ARRIVE METHOD, LEAVE METHOD, DRILL METHOD, HYPERLINK METHOD, CHANGE METHOD, DISPLAY METHOD, EDITFMT METHOD: Toolkit field processing methods.
-
Optional SMC-specific attributes:
- WEB
- COERCED TYPE:
- Specifies the data type for xfNetLink Java or .NET clients.
- Valid values depend on the TYPE attribute.
- Valid coerced types by TYPE:
- Decimal (without precision):
- DEFAULT, BYTE, SHORT, INT, LONG, SBYTE, USHORT, UINT, ULONG, BOOLEAN, DECIMAL, NULLABLE DECIMAL
- Decimal (with precision):
- DEFAULT, DOUBLE, FLOAT, DECIMAL, NULLABLE DECIMAL
- Integer:
- DEFAULT, BYTE, SHORT, INT, LONG, SBYTE, USHORT, UINT, ULONG, BOOLEAN
- Date/Time/User:
- If DATE format is YYMMDD, YYYYMMDD, YYJJJ, or YYYYJJJ, or USER subtype is DATE with ^CLASS^=YYYYMMDDHHMISS or ^CLASS^=YYYYMMDDHHMISSUUUUUU, the type can be DATETIME or NULLABLE_DATETIME.
- If DATE format is
-
Attribute negations: Many attributes come with a negation, often prefixed with “NO” (e.g., NODATE, NODESC). When specifying an attribute, consider whether its positive or negative form is required.
Specific to FIELD:
- TEMPLATE template: Links the field to a particular template.
Specific to GROUP:
- REFERENCE structure: References a particular structure.
- PREFIX prefix, COMPILE PREFIX: Prefix details.
- NOSIZE: Specifies if size shouldn’t be defined.
Defining a field
The FIELD statement describes a field definition. This field is associated with the enclosing structure or group. If no structure or group has been defined, the field is ignored.
FIELD name [TEMPLATE template] TYPE type SIZE size
attribute
.
.
.
Defining a group
The GROUP statement describes a group (field) definition. This group will be associated with the most enclosing structure or group. If no structure or group has been defined yet, the group is ignored.
GROUP name TYPE type [SIZE size]
attribute or field or group
.
.
.
ENDGROUP
Defining a template
TEMPLATE name [PARENT template] TYPE type SIZE size
attribute
.
.
.
Purpose: A template is a set of field characteristics that can be assigned to one or more field or template definitions. Templates are useful for defining common field characteristics that are used in multiple field definitions. Templates can be nested to create a hierarchy of field characteristics. They aren’t commonly used but can be thought of as a “base class” for fields.
Defining a structure
STRUCTURE name filetype [MODIFIED date] [DESCRIPTION "description"]
[LONG DESCRIPTION "long_desc"] [USER TEXT "string"]
Despite being called a structure, this isn’t exactly like a structure in DBL. It’s the definition of a collection of fields/groups that you can .INCLUDE in DBL to define a record, group, or structure.
FILETYPE:
- Indicates the type of file the structure will be assigned to.
- Valid values:
- ASCII
- DBL ISAM
- RELATIVE
- USER DEFINED
Basic example
STRUCTURE info DBL ISAM
    GROUP customer TYPE alpha
        FIELD name TYPE alpha SIZE 40
        GROUP office TYPE alpha SIZE 70
            FIELD bldg TYPE alpha SIZE 20
            GROUP address TYPE alpha SIZE 50
                FIELD street TYPE alpha SIZE 40
                FIELD zip TYPE decimal SIZE 10
            ENDGROUP
        ENDGROUP
        GROUP contact TYPE alpha SIZE 90
            FIELD name TYPE alpha SIZE 40
            GROUP address TYPE alpha SIZE 50
                FIELD street TYPE alpha SIZE 40
                FIELD zip TYPE decimal SIZE 10
            ENDGROUP
        ENDGROUP
    ENDGROUP
Defining a file
FILE name filetype "open_filename"
[DESCRIPTION "description"]
[LONG DESCRIPTION "long_desc"]
...
[ASSIGN structure [ODBC NAME name[, structure [ODBC NAME name], ...]]
Arguments:
Most of these arguments are optional, and many can also be negated by prefixing them with “NO”. For example, NOCOMPRESS is the negation of COMPRESS.
-
name: Name for the file definition.
- Max length: 30 characters.
-
filetype: Type of file. Valid options:
- ASCII
- DBL ISAM
- RELATIVE
- USER DEFINED
-
open_filename: Name of the actual data file with path.
- Max length: 64 characters.
- Enclosed in double or single quotation marks.
-
USER TEXT “string”: (Optional)
- User-defined text.
- Max length: 60 characters.
-
RECTYPE rectype: (Optional)
- Specifies the record type. Used exclusively for DBL ISAM filetypes.
- Valid options: FIXED (default), VARIABLE, MULTIPLE.
-
PAGE SIZE page_size: (Optional)
- Specifies the page size for DBL ISAM filetypes.
- Valid options: 512, 1024 (default), 2048, 4096, 8192, 16384, 32768.
-
DENSITY percentage: (Optional)
- Designates the key density percentage for DBL ISAM files.
- Percentage range: 50 to 100.
- Default is approximately 50%.
-
ADDRESSING addressing: (Optional)
- Determines the address length of the ISAM file for DBL ISAM file types.
- Valid options: 32BIT (default), 40BIT.
-
SIZE LIMIT size_limit: (Optional)
- Indicates the maximum megabytes allowed for the data file (.is1) for REV 6+ ISAM files.
-
RECORD LIMIT record_limit: (Optional)
- Sets the maximum record count permissible in the file for REV 6+ ISAM files.
-
TEMPORARY: (Optional)
- Describes the file definition as temporary. Excludes it from ReportWriter or xfODBC file listings.
-
COMPRESS: (Optional)
- Asserts that file data is compressed, specific to DBL ISAM files.
-
STATIC RFA: (Optional)
- Specifies fixed RFA for records across WRITE operations in DBL ISAM files.
-
TRACK CHANGES: (Optional)
- Enables change tracking in the file for REV 6+ ISAM files.
-
TERABYTE: (Optional)
- Signifies a 48-bit terabyte file for DBL ISAM filetypes.
-
STORED GRFA: (Optional)
- Commands the CRC-32 portion of an RFA to be generated and stored for every STORE or WRITE procedure in DBL ISAM files rather than generated on every read.
-
ROLLBACK: (Optional)
- Permits change tracking rollbacks for the file.
-
NETWORK ENCRYPT: (Optional)
- Mandates encryption for client access for DBL ISAM files.
-
PORTABLE integer_specs: (Optional)
- Passes non-key portable integer data arguments to the ISAMC subroutine for DBL ISAM files.
- Syntax: I=pos:len[,I=pos:len][,…].
-
FILE TEXT “file_text”: (Optional)
- Adds specified text to the header of REV 6+ ISAM files.
- Syntax:
- text_size[K]
- “text_string”
- text_size[K]:“text_string”
-
ASSIGN structure: (Optional)
- Assigns a structure to the file definition.
- Can be assigned multiple structures.
- Max length: 30 characters.
- Enclosed in double or single quotation marks.
-
ODBC NAME name: (Optional)
- Sets the table name for ODBC access, capped at 30 characters.
Discussion:
The FILE statement describes a file definition. These definitions specify files for access through the Repository and the structures used for access.
Key points:
- Only structures with matching file types can be assigned to a definition.
- Each structure should have at least one field.
- For assigning multiple structures, primary keys must match.
- Key criteria include size, sort order, duplicates flag, data type, segments, and segment details.
Defining an enumeration
Purpose: An enumeration offers a way to define a set of named values, ensuring more readable and maintainable code.
Syntax:
ENUMERATION [name] [DESCRIPTION "description"] [LONG DESCRIPTION "long_desc"]
MEMBERS [member] [value], [member] [value], ...
Arguments:
-
name: Denotes the title of the enumeration. The name can be a maximum of 30 characters in length.
-
DESCRIPTION “description”: (Optional) Provides a concise description of the enumeration. The description should be encased in either double (“ “) or single (’ ’) quotation marks and can span up to 40 characters.
-
LONG DESCRIPTION “long_desc”: (Optional) Supplies a comprehensive description, providing more information about the enumeration and its application. This can comprise up to 30 lines, with each line not exceeding 60 characters. Every line should be enveloped in either double or single quotation marks.
-
MEMBERS member: Details the members of the enumeration. Each member’s name can be up to 30 characters long. At least one member is mandatory for an enumeration. If there are multiple members, they should be comma-separated.
- value (Optional): Affiliates a numeric value with the member. When used, this value should appear on the same line as its corresponding member.
Description:
ENUMERATION provides a mechanism for developers to create a set of named constants, enhancing code clarity. This is an analog to the enum provided in DBL but with the benefits of a repository definition.
For instance, one might define an enumeration for days of the week. Instead of working directly with numeric values (which could lead to errors and make the code harder to interpret), developers can use the named constants of the enumeration.
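For comparison, here’s a minimal sketch of the days-of-the-week idea written directly as a DBL enum; the enum and member names are invented for illustration, and a repository ENUMERATION included via .INCLUDE gives you a comparable definition sourced from the data dictionary.
enum DayOfWeek
    Monday
    Tuesday
    Wednesday
    Thursday
    Friday
    Saturday
    Sunday
endenum
main
record
    today, DayOfWeek
proc
    today = DayOfWeek.Wednesday
    if (today == DayOfWeek.Wednesday)
        Console.WriteLine("It is midweek")
endmain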
Defining an alias
Purpose: An alias provides an alternate naming convention for a structure or field in the Synergy Data Language.
Syntax:
ALIAS [alias] [type] [name]
Arguments:
-
alias: The name for the alias. This name can be up to 30 characters long.
-
type: Specifies the alias type. Acceptable values include STRUCTURE and FIELD.
-
name: Represents the original name of the structure or field being aliased. This too can have a length of up to 30 characters.
Description:
The ALIAS statement in Synergy Data Language provides a mechanism to give an alternate name to either a structure or a field. This is particularly beneficial in scenarios like the following:
-
Application conversion: When transitioning an application to use the Repository and there’s a preference for lengthier names in contrast to the shorter, more cryptic ones traditionally used. Aliases can act as an intermediary, facilitating smoother code updates in Synergy.
-
Structfield definition: When there’s a need to define structfields and simultaneously require that the repository structure be a record in the Synergy code. Here, an alias can be formed and used in the structfield definition.
When invoking the .INCLUDE directive to reference a repository in Synergy code, either the original name or its alias can be used. Initially, the compiler will attempt to find a structure or field with the name mentioned in the .INCLUDE command. In the absence of such a name, it looks for an alias. Consequently, all names (whether original or alias) must be distinct. This rule applies for both structures and fields, with the caveat that field names—whether original or alias—need to be unique within a specific structure.
Positioning:
In the Synergy Data Language file, an alias should be situated within the structure it points to, known as the “aliased structure.” It’s permissible to assign multiple aliases to one structure. Similarly, within an aliased structure, one can map numerous aliases to a singular field.
A field with an alias aligns with the last aliased structure defined. In situations where no aliased structure is present, any aliased fields are disregarded. Also, it’s important to note that fields defined within a group cannot be aliased.
Structures
A STRUCTURE statement provides a clear and organized way to manage data in your programs. Unlike the RECORD or GROUP declarations, which allocate space for a specific, non-reusable grouping of variables, a STRUCTURE statement serves to define a blueprint for a data type. This blueprint can be used multiple times across your program, showcasing the advantages of using types in programming.
In contrast, a RECORD or GROUP creates a collection of variables in a fixed arrangement. While useful in certain contexts, this is a one-off assembly of data. Once defined, a RECORD can’t be replicated, passed around, or instantiated, making it less flexible in larger programs.
On the other hand, a STRUCTURE is more akin to a type definition, acting as a template for creating instances of structured data. These instances can then be used, passed around, or returned from routines, offering a level of flexibility beyond what records provide.
Similar to a RECORD or GROUP, you can treat any STRUCTURE that contains only fixed-size types as an alpha. This allows you to write them directly to disk or send them directly over the network without any serialization effort. This doesn’t work if the structure contains an object or another structure that can’t be converted to an alpha.
Structures can be declared globally, within a namespace, class, or routine. If you pass a structure to a routine, the compiler knows precisely what type of data it is and can ensure that you’re using the structure correctly, such as accessing the right fields or using appropriate operations. Therefore, using structures leads to safer, more reliable, and easier-to-maintain code, which is why they’re recommended over groups and named records.
Defining structures
[structure_mod] STRUCTURE name
member_def
.
.
.
[ENDSTRUCTURE]
The above syntax for defining structures is slightly abbreviated. There are other features, but since they have their own chapters, we’ll just cover the common parts here.
The structure_mod preceding STRUCTURE is an optional modifier that provides additional information about the structure. Some of the available structure_mod options are
-
PARTIAL: Partial allows the definition of a structure to be split into multiple files. It is often used in code generation scenarios. Many code generators produce partial structures, allowing developers to extend the generated code with custom logic in a separate file, without the risk of overwriting it when the generated code is updated.
-
BYREF: This modifier is for .NET only. It tells the compiler that instances of the structure can only be allocated on the stack, not on the heap. This has implications for memory management and the structure’s lifecycle, as stack-allocated structures have a deterministic, scoped lifetime that ends when the execution leaves the scope in which they were created. This contrasts with heap-allocated objects, which have their lifetimes managed by the garbage collector. BYREF structures also come with usage restrictions and cannot be used in certain contexts like generics, boxed objects, lambdas, YIELD iterators, and ASYNC methods. Despite these restrictions, BYREF structures can be particularly efficient in performance-critical parts of your code, where avoiding garbage collection overhead is important.
-
CLS: This modifier is for .NET only. Applying the CLS modifier to a structure signals to the compiler that only CLS-compliant types will be contained in that structure. Although this restricts the type of fields that can be included in the structure, it notably broadens the ways you can manipulate instances of that structure. The implications are substantial, especially when dealing with generics, which we’ll explore in more depth in an upcoming chapter on that topic.
-
READONLY: This modifier is for .NET only. It tells the compiler that instances of this structure are immutable, meaning their state cannot be changed after they are created. Immutability is a powerful concept in programming as it simplifies code by eliminating side effects and making it easier to reason about the behavior of the code. READONLY structures also have performance benefits, especially when combined with BYREF. The .NET JIT can make certain optimizations, knowing that a method cannot modify the state of a READONLY structure. This makes READONLY structures an attractive choice for high-performance .NET code, where minimizing copies of value types is essential.
A member_def inside a structure is either a single field definition or a group declaration. A field definition follows the syntax specified in “Defining a field,” while a group declaration adheres to the GROUP-ENDGROUP syntax. There is no limit to the number of member definitions you can specify within a structure.
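To tie this together, here’s a minimal sketch (the structure name and fields are invented) that defines a structure, passes an instance to a function as a strongly typed parameter, and, because the structure contains only fixed-size types, also treats it as an alpha as described earlier.
structure customer
    id,   d8
    name, a30
endstructure
function describe_customer, @string
    cust, customer
proc
    freturn %string(cust.id) + " " + %atrim(cust.name)
endfunction
main
record
    c,    customer
    flat, a38        ;; d8 + a30 = 38 characters
proc
    c.id = 42
    c.name = "Acme Ltd."
    Console.WriteLine(%describe_customer(c))
    ;; Because customer contains only fixed-size types, it can be treated
    ;; as an alpha and moved around or written out without serialization.
    flat = c
    Console.WriteLine(flat)
endmain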
Quiz Answers
Collections
Creating efficient, effective code requires the ability to organize data. One way to manage and manipulate groups of data is through collections, which are data structures (ordered groups of objects) that enable you to store, access, and modify multiple pieces of data as a single unit. Some underlying collection types that we address in this chapter are arrays, arraylists, and dictionaries.
The unique characteristics of these collection types make them most suitable for specific kinds of tasks:
-
Arrays provide a straightforward, fixed-size structure to manage ordered data.
-
Arraylists offer more flexibility than arrays by allowing dynamic resizing while maintaining order, making them especially useful when the size of the data set is uncertain.
-
Dictionaries enable you to store and retrieve data using unique keys, allowing for fast lookups based on meaningful identifiers rather than just numerical indexes.
ArrayList
System.Collections.ArrayList and Synergex.SynergyDE.Collections.ArrayList are both implementations of a dynamically sized array of objects, with the Synergex.SynergyDE.Collections version existing to provide a 1-based index for compatibility with traditional DBL. Unless you have a specific need to use 1-based indexing, it’s best to use the System.Collections version. If you’re using .NET exclusively, I recommend using generics and the System.Collections.Generic.List<T> class instead.
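If you’re targeting .NET, a minimal sketch of that generic alternative might look like the following; it assumes Synergy .NET’s syntax for declaring and constructing a generic List<string>.
record
    names, @System.Collections.Generic.List<string>
    name,  @string
proc
    names = new System.Collections.Generic.List<string>()
    names.Add("first")
    names.Add("second")
    ;; every element is known to be a string, so no casting is required
    foreach name in names
        Console.WriteLine(name)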
Everything is an object
When compared to an Array or a generic type like List<T>, an ArrayList stores each of its elements as an object: values are boxed on the way in and must be cast back to their original types on the way out, as the following example shows.
structure simple_structure
somefield, a10
someother, d20
endstructure
main
record
myArrayList, @System.Collections.ArrayList
myStructure, simple_structure
myInt, i4
myString, @string
proc
myArrayList = new System.Collections.ArrayList()
myArrayList.Add((@a)"hello a") ;myArrayList[0]
myArrayList.Add((@d)5);myArrayList[1]
myArrayList.Add((@string)"hello string") ;myArrayList[2]
myArrayList.Add((@int)5);myArrayList[3]
;;you can put anything into an array list, even another array list!
myArrayList.Add(new System.Collections.ArrayList()) ;myArrayList[4]
myStructure.somefield = "first"
myStructure.someother = 1
;;boxes a copy of myStructure
myArrayList.Add((@simple_structure)myStructure) ;myArrayList[5]
;;we can change myStructure now without impacting the first boxed copy
myStructure.somefield = "second"
myStructure.someother = 2
;;boxes another copy of myStructure
myArrayList.Add((@simple_structure)myStructure) ;myArrayList[6]
;;lets get the values back out again
;;all objects implement ToString, its not always useful
;;but we can see what comes out
Console.WriteLine(myArrayList[0].ToString())
Console.WriteLine(myArrayList[1].ToString())
Console.WriteLine(myArrayList[2].ToString())
Console.WriteLine(myArrayList[3].ToString())
Console.WriteLine(myArrayList[4].ToString())
Console.WriteLine(myArrayList[5].ToString())
Console.WriteLine(myArrayList[6].ToString())
;;lets get a few objects back to their original type
myInt = (@i4)myArrayList[3]
Console.WriteLine(%string(myInt))
myString = (@string)myArrayList[2]
Console.WriteLine(myString)
myStructure = (@simple_structure)myArrayList[5]
Console.WriteLine(myStructure.somefield) ;this should contain "first" not "second"
;;this won't work because we haven't correctly matched the types
myInt = (@i4)myArrayList[0] ;;this is actually a boxed alpha
endmain
Output
hello a
5
hello string
5
SYSTEM.COLLECTIONS.ARRAYLIST
first 00000000000000000001
second 00000000000000000002
5
hello string
first
%DBR-E-INVCAST, Invalid cast operation Class <SYSTEM.TYPE_I> is not an ancestor of <SYSTEM.TYPE_A>
Finding things
The IndexOf and LastIndexOf methods in the ArrayList class provide the ability to locate the position of an object within the list. The IndexOf method searches from the beginning of the ArrayList and returns the 0-based index of the first occurrence of the specified object, while LastIndexOf starts the search from the end and returns the index of the last occurrence. Both methods return -1 if the object is not found. However, it’s important to note that these methods rely on the default comparer of the object type stored in the ArrayList. If the objects in the list do not provide a meaningful implementation of the Equals method, then the comparison is based on reference equality, meaning that it checks if the references point to the same memory location, not if their contents are the same. This can be a limitation when working with complex objects or objects from classes that haven’t overridden the Equals method to provide value-based comparison. To address this limitation, it’s often necessary to implement a custom equality comparison by overriding the Equals method in the object’s class, or by using a more type-safe collection like List<T>.
record
myArrayList, @System.Collections.ArrayList
myStructure, simple_structure
proc
myArrayList = new System.Collections.ArrayList()
myArrayList.Add((@string)"hello 1") ;myArrayList[0]
myArrayList.Add((@string)"hello 2") ;myArrayList[1]
myArrayList.Add((@string)"hello 2") ;myArrayList[2]
myArrayList.Add((@string)"hello 3") ;myArrayList[3]
Console.WriteLine(%string(myArrayList.IndexOf((@string)"hello 2")))
Console.WriteLine(%string(myArrayList.LastIndexOf((@string)"hello 2")))
myStructure.somefield = "first"
myStructure.someother = 1
;;boxes a copy of myStructure
myArrayList.Add((@simple_structure)myStructure) ;myArrayList[4]
myStructure.somefield = "second"
myStructure.someother = 2
;;boxes another copy of myStructure
myArrayList.Add((@simple_structure)myStructure) ;myArrayList[5]
;;****WARNING****
;;returns 5 in traditional DBL, returns -1 in .NET
;;the implementation in .NET does not provide "structural equality" for Synergy structures
;;this is because the DBL compiler does not automatically generate an
;;overridden Equals method for "simple_structure"
;;Traditional DBL happens to treat structures and alphas the same and so gains
;;the ability to check for structural equality
Console.WriteLine(%string(myArrayList.IndexOf((@simple_structure)myStructure)))
Traditional output
1
2
5
.NET output
1
2
-1
Arrays
Real arrays and pseudo arrays
Fixed size arrays can either be “real” or “pseudo.” While pseudo arrays are deprecated and not recommended for use, understanding their properties and limitations is still beneficial. It’s important to note that, unlike many other programming languages, DBL uses a one-based indexing system for arrays, meaning that array elements start at index 1.
Real arrays
When declaring a real array, each declared dimension must be specified. For instance, consider a variable declaration where a record named brk is a 3x4 array of type d1.
brk ,[3,4]d1
In this case, brk[1,2] and brk[2,2] are valid references as each dimension is specified. However, brk[1] is not valid and will trigger a compiler error, “Error DBL-E-NFND: %DBL-E-NFND, brk[D] not found : Console.WriteLine(brk[1]),” because it specifies only one of the two dimensions.
For purely historical reasons brk[] is valid and refers to the entire scope, or contents, of the dimensioned array as a single element. The maximum size of a scope reference is 65,535. If the array is larger than this, the scope size is modulo 65,535.
record demo
alpha ,[3,2]d2 , 12 ,34,
& 56 ,78,
& 98 ,76
beta ,[2,4]a3, "JOE" ,"JIM" ,"TED" ,"SAM",
& "LOU" ,"NED" ,"BOB" ,"DAN"
proc
Console.WriteLine(demo.alpha[1,2])
Console.WriteLine(demo.beta[2,1])
Console.WriteLine(demo.beta[])
Output
34
LOU
JOEJIMTEDSAMLOUNEDBOBDAN
Pseudo arrays
A pseudo array is a one-dimensional array of type a, d, d., i1, i2, i4, i8, p, or p.. A real array can consist of those same types, as well as structure, but can be defined with multiple dimensions.
In Traditional DBL, if a pseudo array is used in a class, it is converted to a real array by the compiler. A pseudo array is also converted to a real array if it is used outside a class and the code is compiled with -qcheck. If -W4 is set, the compiler reports a level 4 warning regarding this change.
In DBL running on .NET, pseudo arrays are treated as real arrays by the compiler. As a result, you must use the (*) syntax to pass a pseudo array as an argument.
It’s worth noting that you cannot declare a real array of .NET value types or objects or of CLS structures. For example, declarations like 10 int or 10 String are not permitted. Instead, use an array of i4 or a dynamic array, such as [#]int or [#]String.
Finally, to get the length of an entire static array, use ^SIZE on the array variable with empty brackets:
record
fld, 10i4, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
proc
Console.WriteLine(^size(fld[]))
Console.WriteLine(^size(fld[1]))
Console.WriteLine(fld[1])
Output
40
4
1
Dynamic arrays
Dynamic arrays provide a structure for storing multiple values of the same type, with the ability to resize during program execution. They are declared using the hash symbol in square brackets ([#]), which denotes a dynamic array.
Dynamic arrays are not limited to storing simple data types such as integers or characters; they can also store complex data types like structures or classes.
Regardless of what platform you’re running on, dynamic arrays support various operations for data manipulation and retrieval. These include methods like Clear(), IndexOf(), LastIndexOf(), and Copy(), which clear the contents, find an element, find the last matching element, and copy a range of elements, respectively. The Length property can be used to get the current size of the array. Dynamic arrays can be iterated over one element at a time using a FOREACH loop. When running on .NET, you have access to additional properties and methods provided by the .NET Base Class Library.
Using dynamic arrays in DBL can make your code more flexible and efficient, as it allows you to handle varying data quantities without the need for manual memory management using ^M or huge static arrays.
Here are a few examples showing how to declare and initialize dynamic arrays in DBL:
record
myStringArray, [#]String
anotherStringArray, [#]String
proc
;;create a new array with 2 values
myStringArray = new String[#] { "first value", "second value"}
;;replace the second element
myStringArray[2] = "a different second value"
Console.WriteLine(myStringArray[2])
;;allocate a new array and Copy the elements from myStringArray
anotherStringArray = new String[2]
Array.Copy(myStringArray, 1, anotherStringArray, 1, 2)
Console.WriteLine(anotherStringArray[1])
Output
a different second value
first value
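The next sketch exercises a few of the operations mentioned above (Length, IndexOf, and a FOREACH loop) on a dynamic array of integers; the values are arbitrary.
record
    nums,  [#]int
    n,     int
    total, int
proc
    nums = new int[#] { 10, 20, 30 }
    Console.WriteLine(%string(nums.Length))       ;; number of elements: 3
    Console.WriteLine(%string(nums.IndexOf(20)))  ;; position of the element equal to 20
    total = 0
    foreach n in nums
        total = total + n
    Console.WriteLine(%string(total))             ;; 60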
Dictionary
There is nothing built into Traditional DBL that behaves exactly like Dictionary in .NET. However, DBL contains a few of the necessary building blocks, so we’re going to use this as an opportunity to build our own Dictionary class. This will be a good exercise in using some of the collections and concepts we’ve covered so far.
What’s this useful for?
Depending on your historical context, you might be wondering, why should I care about a dictionary when I can just use an ISAM file? Alternatively, you might be wondering why there isn’t much of a built-in in-memory associative lookup data structure. I’ll start by trying to sell you on the benefits of an in-memory dictionary.
Using in-memory dictionaries offers several benefits:
- Speed: Accessing and modifying data in memory is orders of magnitude faster than disk operations.
- Efficiency: In-memory operations reduce the overhead of disk I/O, making data processing more efficient.
- Simplicity: Working with data in memory often simplifies the code, reducing the complexity associated with file management.
- Serialization: There's no need to serialize and deserialize data when it's already in memory. More importantly, there's no need to worry about data structures like string that can't be written directly to ISAM files.
Now it’s time to discuss the downsides, and these are probably why in-memory dictionaries aren’t used much in traditional DBL code. DBL programs often handle large volumes of data, and while the amount of RAM installed on your production servers may have grown significantly over the last 30 years, there are still some operations where you should work with on-disk structures like a temporary ISAM file.
That said, there are still plenty of scenarios where an in-memory dictionary is a good fit. For example, if you need to perform a series of lookups on a small set of data, it’s often more efficient to load the data into memory and perform the lookups there, rather than repeatedly accessing the disk. This is especially true if the data is already in memory, such as when it’s being passed from one routine to another. In such cases, using an in-memory dictionary can be a good option. As with all things architecture and performance related, your mileage may vary, and you should always test your assumptions.
Implementation
Let’s jump into a high-level overview for our custom implementation of a dictionary-like data structure, combining the Symbol Table API with System.Collections.ArrayList to manage string lookups of arbitrary objects.
Overview
- Purpose: To create a dictionary for string-based key lookups.
- Key components:
- Symbol Table API: For handling key-based lookups.
- System.Collections.ArrayList: For storing objects.
- Operations Supported: Add, Find, Delete, and Clear entries.
Class structure
StringDictionary class
- Purpose: Acts as the main dictionary class.
- Key components:
- symbolTableId: The identifier for the symbol table.
- objectStore: An ArrayList to store objects.
- freeIndices: An ArrayList to manage free indices in objectStore.
KeyValuePair inner class
- Purpose: Represents a key-value pair.
- Components:
- Key: The key (string).
- Value: The value (object).
Constructor: StringDictionary()
- Initializes objectStore and freeIndices.
- Calls nspc_open to create a symbol table with specific flags.
- Flags used:
- D_NSPC_SPACE: Leading and trailing spaces in entry names are significant.
- D_NSPC_CASE: Case sensitivity for entry names.
Destructor: ~StringDictionary()
- Closes the symbol table using nspc_close.
Methods overview
-
Add method:
- Adds a new key-value pair to the dictionary.
- Checks for duplicate keys using nspc_find.
- If no duplicate, adds the key-value pair using nspc_add.
-
TryGet method:
- Tries to get the value for a given key.
- Uses nspc_find to locate the key.
- If found, retrieves the value from objectStore.
-
Get method:
- Retrieves the value for a given key.
- Similar to TryGet but throws an exception if the key is not found.
-
Set method:
- Sets or updates the value for a given key.
- If the key exists, updates the value.
- If the key doesn’t exist, adds a new key-value pair.
-
Remove method:
- Removes a key-value pair from the dictionary.
- Uses nspc_find to locate the key.
- Deletes the entry using nspc_delete.
-
Contains method:
- Checks if a key exists in the dictionary.
-
Clear method:
- Clears the dictionary.
- Uses nspc_reset to clear the symbol table.
-
Items method:
- Returns a collection of all key-value pairs in the dictionary.
Internal methods
-
AddObjectInternal:
- Manages adding objects to the objectStore.
- Uses freeIndices to reuse free slots in objectStore.
-
RemoveObjectInternal:
- Manages removing objects from objectStore.
- Adds the index to freeIndices.
Symbol Table API integration
- The Symbol Table API (%NSPC_ADD, %NSPC_FIND, %NSPC_DELETE, etc.) is used for managing keys in the dictionary.
- objectStore holds the actual objects, while the symbol table keeps track of the keys and their corresponding indices in objectStore.
Error handling
- The class includes error handling for situations like duplicate keys or keys not found.
Usage
- This StringDictionary class can be used for efficient key-value pair storage and retrieval, especially useful in scenarios where the keys are strings and the values are objects of arbitrary types.
import System.Collections
.include 'DBLDIR:namspc.def'
namespace DBLBook.Collections
public class StringDictionary
public class KeyValuePair
public method KeyValuePair
key, @string
value, @object
proc
this.Key = key
this.Value = value
endmethod
public Key, @string
public Value, @Object
endclass
private symbolTableId, i4
private objectStore, @ArrayList
private freeIndices, @ArrayList
public method StringDictionary
proc
objectStore = new ArrayList()
freeIndices = new ArrayList()
symbolTableId = nspc_open(D_NSPC_SPACE | D_NSPC_CASE, 4)
endmethod
method ~StringDictionary
proc
xcall nspc_close(symbolTableId)
endmethod
private method AddObjectInternal, i4
value, @object
proc
if(freeIndices.Count > 0) then
begin
data freeIndex = (i4)freeIndices[freeIndices.Count - 1]
freeIndices.RemoveAt(freeIndices.Count - 1)
objectStore[freeIndex] = value
mreturn freeIndex
end
else
mreturn objectStore.Add(value)
endmethod
private method RemoveObjectInternal, void
index, i4
proc
freeIndices.Add((@i4)index)
;;can't just call removeAt because it would throw off all of the objects that are stored after it
;;so we just add to a free list and manage the slots that way
objectStore[index] = ^null
endmethod
public method Add, void
req in key, @string
req in value, @object
record
existingId, i4
newObjectIndex, i4
proc
if(nspc_find(symbolTableId, key,, existingId) == 0) then
begin
newObjectIndex = AddObjectInternal(new KeyValuePair(key, value))
nspc_add(symbolTableId, key, newObjectIndex)
end
else
throw new Exception("duplicate key")
endmethod
public method TryGet, boolean
req in key, @string
req out value, @object
record
objectIndex, i4
kvp, @object
proc
if(nspc_find(symbolTableId, key,objectIndex) != 0) then
begin
kvp = objectStore[objectIndex]
value = ((@KeyValuePair)kvp).Value
mreturn true
end
else
begin
value = ^null
mreturn false
end
endmethod
public method Get, @object
req in key, @string
record
objectIndex, i4
kvp, @Object
proc
if(nspc_find(symbolTableId, key,objectIndex) != 0) then
begin
kvp = objectStore[objectIndex]
mreturn ((@KeyValuePair)kvp).Value
end
else
throw new Exception("index not found")
endmethod
public method Set, void
req in key, @string
req in value, @object
record
objectIndex, i4
proc
if(nspc_find(symbolTableId, key,objectIndex) != 0) then
begin
objectStore[objectIndex] = new KeyValuePair(key, value)
end
else
Add(key, value)
endmethod
public method Remove, void
req in key, @string
record
objectAccessCode, i4
objectIndex, i4
proc
if((objectAccessCode=%nspc_find(symbolTableId,key,objectIndex)) != 0)
begin
nspc_delete(symbolTableId, objectAccesCode)
RemoveObjectInternal(objectIndex)
end
endmethod
public method Contains, boolean
req in key, @string
proc
mreturn (nspc_find(symbolTableId, key) != 0)
endmethod
public method Clear, void
proc
nspc_reset(symbolTableId)
freeIndices.Clear()
objectStore.Clear()
endmethod
public method Items, [#]@StringDictionary.KeyValuePair
record
itm, @StringDictionary.KeyValuePair
result, [#]@StringDictionary.KeyValuePair
itemCount, int
i, int
proc
itemCount = 0
foreach itm in objectStore
begin
if(itm != ^null)
incr itemCount
end
result = new KeyValuePair[itemCount]
i = 1
foreach itm in objectStore
begin
if(itm != ^null)
begin
result[i] = itm
incr i
end
end
mreturn result
endmethod
endclass
endnamespace
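Finally, here’s a short usage sketch for the class above; the keys and values are arbitrary, and the calls map directly onto the methods described in the overview.
import DBLBook.Collections
main
record
    dict,  @StringDictionary
    value, @object
proc
    dict = new StringDictionary()
    dict.Add("customer-1", (@string)"Acme Ltd.")
    dict.Add("customer-2", (@string)"Widget Co.")
    if (dict.TryGet("customer-1", value))
        Console.WriteLine((@string)value)            ;; prints Acme Ltd.
    dict.Set("customer-2", (@string)"Widget Corp.")  ;; replaces the existing value
    value = dict.Get("customer-2")
    Console.WriteLine((@string)value)
    dict.Remove("customer-1")
    if (!dict.Contains("customer-1"))
        Console.WriteLine("customer-1 was removed")
endmain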
Program Organization
DBL encourages structured, modular programming design, which supports code reusability and separation of concerns and enables you to take advantage of a distributed processing environment. Modular code is contained in an isolated functional unit with a well-defined, published interface. In this chapter, we’ll take a look at some of the things that improve code organization:
-
Namespaces, which provide logical groupings for code elements with related functionality, helping to avoid naming conflicts in a project, especially when working with multiple libraries or developers. They make code easier to read and maintain by clearly defining where in the codebase identifiers belong.
-
Projects, which organize code into executable units, each representing a specific area or functionality. Projects collect all files, settings (compiler settings, environment variables, etc.), data connections, and references needed to build a Synergy program or library. They then supply this information to MSBuild at build time, and they enable Visual Studio’s IntelliSense features to work for Synergy DBL files.
-
Libraries, which group related classes and methods into reusable components, making it easier to maintain and share code. DBL has different types of libraries depending on what you’re doing and what tools you’re using. In the Libraries section of this book, we discuss how to create and link libraries manually for Traditional DBL, how to create libraries with MSBuild for Traditional DBL, and how to create libraries for Synergy .NET.
-
Robust prototype validation during compilation, which ensures that the compiler checks each routine against its prototype to verify the correct number and types of arguments, the proper return type, and other specifications. For Synergy .NET code, strong prototyping is always enforced. In Traditional Synergy object-oriented code, strong prototyping is mandatory and occurs within a single compilation unit. However, when working across multiple compilation units—such as when using multiple dbl commands—you need to generate prototypes using the Synergy Prototype utility (dblproto) to enable validation.
A structured approach ensures that your program is organized in a way that enhances scalability, maintainability, and extensibility, allowing team members to work in parallel while minimizing dependencies.
Namespaces
A namespace is essentially a container that allows developers to bundle up a set of related functionalities, classes, or structures under a unique identifier. This concept ensures that similarly named entities do not collide, which is particularly beneficial as software projects grow in size and complexity.
Rationale: The primary reason for using namespaces is to prevent naming conflicts. As software projects expand, it becomes increasingly likely that multiple developers or teams, perhaps working on different libraries or modules, will inadvertently use the same name for a function, class, or variable. Such conflicts can lead to ambiguous references, making it unclear which entity is being referred to, thereby leading to potential errors and unpredictable behavior. Namespaces mitigate this problem by providing a clear and defined scope, ensuring that even if two entities share the same name, as long as they reside in different namespaces, they will remain distinct and won’t conflict.
Moreover, namespaces assist in code organization. By segregating functionalities into different namespaces, it becomes much easier to understand the modular structure of a project, track dependencies, and maintain the code. Developers get a clearer perspective on where to locate specific functionalities and how different components of a system relate to each other.
Mechanics: Defining a namespace is straightforward. Once a namespace is declared, all subsequent types, classes, and methods reside under that namespace until it’s explicitly closed or a new namespace is declared. For instance:
namespace MyNamespace
class MyClass
endclass
endnamespace
The class MyClass is now under the MyNamespace namespace and can be referred to specifically as MyNamespace.MyClass.
Import syntax: When you want to utilize elements (like classes or methods) from a namespace in another part of your code, you can do so using the IMPORT statement. This means you won’t need to provide the full namespace path every time you reference an entity.
For example, to use the previously mentioned MyClass without specifying its namespace every time, you’d write
import MyNamespace
After this, MyClass can be used directly in the code without the MyNamespace. prefix.
IMPORT statements should appear at the top of the file. This “rule” isn’t enforced by the compiler, but because an IMPORT has the same effect regardless of where it appears in the file, placing one in the middle of a file would be surprising to readers.
Nested namespaces: Namespaces can be nested inside one another. For instance, you can have a namespace MyNamespace that contains another namespace MyNamespace.MySubNamespace. This is useful for organizing related functionalities into a hierarchy. You can do this in a single statement or multiple nested statements. For example:
namespace MyNamespace.MySubNamespace
class MyClass
endclass
endnamespace
;;this is equivalent to the above
namespace MyNamespace
namespace MySubNamespace
class MyClass
endclass
endnamespace
endnamespace
Caveats: In Traditional DBL, namespaces cannot effectively contain functions and subroutines. If you defined a function or subroutine inside a namespace, it would be global and not contained within the namespace. This is not the case in .NET. In .NET, you can define functions and subroutines inside a namespace and they will be contained within the namespace. This unfortunate internal limitation is caused by the way functions and subroutines are found at runtime inside loaded ELBs.
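The sketch below illustrates this caveat; the namespace and function names are invented. The same source compiles in both environments, but only on .NET is add_numbers actually contained in MyCompany.Utils.
namespace MyCompany.Utils
    ;; On .NET this function is scoped to MyCompany.Utils and reached via IMPORT.
    ;; In Traditional DBL it is global at runtime despite the namespace block.
    function add_numbers, i4
        a, i4
        b, i4
    proc
        freturn a + b
    endfunction
endnamespace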
Projects
While scripted builds can offer a sense of straightforward control and can give the impression of quicker compile times for small changes, there are several compelling reasons why programmers should consider using structured build systems like MSBuild.
Consistency and standardization
Structured approach: MSBuild provides a consistent and standardized approach to building projects. It ensures that every build follows the same steps and rules, reducing the chances of discrepancies that can often arise with custom scripts.
Team collaboration: In a team environment, standardization is crucial. MSBuild allows different team members to work on the same project with a unified understanding of the build process, minimizing conflicts and confusion that can arise from individualized scripts.
Complexity management
Handling large projects: As projects grow in size and complexity, maintaining custom scripts for building can become increasingly cumbersome and error-prone. MSBuild is designed to handle complex dependency trees and project structures efficiently, something that’s hard to maintain manually in scripts.
Automated dependency tracking: MSBuild automatically handles dependencies between files. This means that it intelligently recompiles only what is necessary, reducing the manual effort of tracking changed files, a process that is prone to human error. Going a step further, DBL actually checks the external signatures of your dependencies to prevent a small change to a core library from turning into a full rebuild of your entire codebase.
Integration with tools and ecosystems
IDE integration: MSBuild is tightly integrated with Visual Studio and other development tools. This integration provides developers with seamless experiences, such as detailed build diagnostics, easy configuration management, and immediate feedback on build errors and warnings.
Ecosystem compatibility: Using MSBuild ensures compatibility with a wide range of tools and plugins in the .NET ecosystem, including continuous integration systems.
Advanced features and flexibility
Customization and extensibility: While MSBuild provides a structured approach, it also offers extensive customization and extensibility options. Developers can define custom build steps, specify conditional builds, and integrate other tools as needed, all within a structured framework.
Cross-platform builds: Builds produced by MSBuild can be executed on Windows or Linux for both the Traditional runtime and the .NET runtime. However, for DBL code, the MSBuild build itself currently still needs to run on Windows.
Maintainability and future-proofing
Easier maintenance: A structured build system is generally easier to maintain and update. Changes in the build process or project structure can be implemented more systematically, without the need to rewrite scripts from scratch.
Community and support: MSBuild, being a widely used and Microsoft-supported build system, benefits from a large community, regular updates, and professional support. This status ensures that the build system remains up-to-date with the latest technology trends and best practices.
Introduction to MSBuild’s XML format
MSBuild projects are defined using XML. At the heart of an MSBuild file, with a .synproj extension for DBL projects, are various elements that describe how to build a project. For DBL, you might have custom source file extensions, but the structure remains consistent with MSBuild standards.
Older MSBuild-style projects
Traditional DBL and DBL targeting the .NET Framework are currently built using the older MSBuild-style projects.
Verbose XML schema: Traditional MSBuild project files are known for their verbosity. They contain detailed specifications for every file in the project, along with numerous property and target definitions. This verbosity often made the project files large and cumbersome to edit and maintain.
Package management: In older MSBuild projects, NuGet package references were typically managed in separate packages.config files. This approach required additional synchronization between the package configuration and the project file.
Framework targeting: Targeting multiple frameworks required more manual setup. Developers had to carefully manage conditional statements within the project file to accommodate different frameworks, making the process error-prone and complex.
Build process customization: Customizing the build process involved manually editing the project file to include various MSBuild tasks and targets. This required a deep understanding of MSBuild’s inner workings.
SDK-style projects
With the introduction of .NET Core, SDK-style projects became the standard. These projects are designed to be simpler, more concise, and easier to work with. When targeting .NET with DBL, you’ll be using SDK-style projects.
Simplified and lean structure: SDK-style project files are much leaner and more readable. They use a simplified XML schema and often require only a minimal set of elements to work. Files are included implicitly, so there’s no need to list each file individually.
Integrated package management: SDK-style projects integrate NuGet package references directly within the project file using the PackageReference node. This eliminates the need for packages.config and simplifies the management of dependencies.
Multi-targeting simplified: SDK-style projects make it easier to target multiple frameworks. Developers can specify multiple target frameworks in a single property (TargetFrameworks), greatly simplifying the process.
Cross-platform and modern tooling: These projects are designed with cross-platform support in mind and are built to work seamlessly with modern tools like the .NET CLI. This makes them more adaptable to different environments and toolchains.
Enhanced project Sdk attribute: The Sdk attribute in the project file header specifies which SDK will be used (e.g., Microsoft.NET.Sdk for .NET Core projects). This attribute abstracts much of the complexity and allows the project file to focus on the specifics of the project itself.
Now that you know there is a difference between the project styles, we’re going to explain the common elements that don’t really change between the two.
Selecting the output type
The output type of a project is specified within the <PropertyGroup> element. For a DBL application, you might be building a console app or a library. This is specified using the <OutputType> tag. For example:
<PropertyGroup>
<OutputType>Exe</OutputType>
<!-- TODO: Other properties -->
</PropertyGroup>
This snippet sets the output type to an executable. The following table lists the various output types available for DBL projects:
Referencing other projects
To reference other projects, such as libraries or dependencies, use the <ItemGroup> element with <ProjectReference> tags. Each reference includes the path to the other project file:
<ItemGroup>
<ProjectReference Include="..\Library\MyLibrary.synproj" />
<!-- TODO: Additional project references -->
</ItemGroup>
This structure allows your DBL project to integrate and use functionalities from other projects within your solution.
Adding source files
Source files are included in the project through the <ItemGroup>
element, using the <Compile>
tag. For DBL, you would specify each source file (.dbl
) you want to include:
<ItemGroup>
<Compile Include="src\MyProgram.dbl" />
<!-- TODO: Other source files -->
</ItemGroup>
This ensures that MSBuild recognizes and compiles all the necessary DBL source files.
Adding include files
Include files, which might contain shared code or definitions, are also added via the <ItemGroup>
element. However, because you don’t want to hand these to the DBL compiler as though they were source, you might use the <None>
or <Content>
tag:
<ItemGroup>
<None Include="includes\MyIncludeFile.dbl" />
<!-- TODO: Other include files -->
</ItemGroup>
This inclusion ensures that these files are part of the project and can be easily navigated and searched within Visual Studio but won’t be treated as top-level source files by the compiler.
Managing access to a Synergy repository
Managing common build settings
There are a few ways to manage build settings that need to be common across multiple projects. The first is to use a Directory.Build.props file. This file can be placed in the root of your repository and will be automatically included in all projects within the repository. This is a good place to put settings that are common across all projects in the repository. For example, if you want to set the default target framework for all projects in the repository to .NET 6.0, you can add the following to the Directory.Build.props file:
<Project>
<PropertyGroup>
<TargetFramework>net6.0</TargetFramework>
</PropertyGroup>
</Project>
The second way to manage common build settings is to use Common.props
files. The Visual Studio integration for DBL offers direct GUI access to managing build time environment variables. Because of the particulars of how environment variables are commonly used in DBL build systems, it’s best to use a Common.props
file to manage these settings. We aren’t going to cover the Visual Studio instructions here, as there is a wealth of YouTube videos, articles, and documentation for that route. If you want to set this up by hand instead, you can use the following snippet to add a Common.props
file to your project at the top of your <Project
element:
<Import Project="$(SolutionDir)Common.props" />
This snippet will import the Common.props
file from the root of your solution. You can then add the following to your Common.props
file to set environment variables:
<?xml version="1.0" encoding="utf-8"?>
<Project ToolsVersion="15.0" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
<PropertyGroup>
<CommonEnvVars>EXEDIR=$(ProjectDir)..\$(Configuration)\$(Platform)\;SOMEOTHER_ENVVAR=blablabla</CommonEnvVars>
</PropertyGroup>
</Project>
This code will set the EXEDIR
and SOMEOTHER_ENVVAR
environment variables for all projects in your solution. You can then use these environment variables in your build scripts. For example, you can use the EXEDIR
environment variable to set the output directory for your build by adding/updating the following inside an active <PropertyGroup>
element:
<UnevaluatedOutputPath>EXEDIR:</UnevaluatedOutputPath>
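Pieced together, the relevant parts of a project file using this approach might look like the following sketch (only the import and the output path property are shown; the rest of the project stays as it was):
<Project Sdk="Microsoft.NET.Sdk"> <!-- or your existing older-style <Project> header -->
  <Import Project="$(SolutionDir)Common.props" />
  <PropertyGroup>
    <UnevaluatedOutputPath>EXEDIR:</UnevaluatedOutputPath>
  </PropertyGroup>
  <!-- source files, references, and other properties as usual -->
</Project>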
Grouping projects into solutions
Using a solution file (.sln) in an MSBuild-based build system is required to effectively manage multiple projects. A .sln file is a text file that lists the projects that make up your solution, essentially serving as a project aggregator. It allows developers to organize, build, and manage a group of related projects as a single entity. This is particularly useful when your application consists of multiple components, such as a library, a user interface, and various service modules. Each project can be developed and maintained separately, with its own set of files, resources, and dependencies. The .sln file keeps track of these projects and their dependencies, ensuring that when you build the solution, MSBuild will compile the projects in the correct order. The flexibility of solution files extends to allowing custom-named solution configurations, within which developers can selectively determine which projects to build and can specify project-level configurations (such as Debug or Release) for each, offering a tailored and granular control over the build process. This second level of configuration is powerful, but it’s very easy to get confused if you’re not careful with your naming conventions.
Creating a solution file
You likely already have a solution file for your project. If you don’t, you can create one by using the dotnet new sln
command. This command creates a new solution file with the same name as the current directory. You can also specify a name for the solution file by using the -n
or --name
option. For example, to create a solution file named MySolution.sln
, you would use the following command:
dotnet new sln -n MySolution
Adding projects to a solution file
Once you have a solution file, you can add projects to it using the dotnet sln add
command. First, you will need to make sure your project file has explicitly specified its project type. This is going to feel a little bit like boilerplate, and it is, but doing things this way will ensure you know every part of your build system and will make it easier to maintain in the long run. You can check to see if you already have the required project type GUID by opening your project file and looking for something like the following structure:
<Project>
...
<PropertyGroup>
...
<ProjectTypeGuids>{BBD0F5D1-1CC4-42FD-BA4C-A96779C64378}</ProjectTypeGuids>
</PropertyGroup>
...
Here’s a list of the project type GUIDs and their meanings:
If you have a Traditional DBL project, you would combine the two project type GUIDs like this:
<ProjectTypeGuids>
{7B8CF543-378A-4EC1-BB1B-98E4DC6E6820};{BBD0F5D1-1CC4-42fd-BA4C-A96779C64378}
</ProjectTypeGuids>
Now that you have your project type GUIDs sorted out, in order to add a project named MyProject
to the solution file, you would use the following command from the folder where the solution file is located:
dotnet sln add path/to/MyProject.synproj
If you’re missing the project type GUIDs, you’ll get an error like this:
dotnet sln add HelloWorld.synproj
Project 'D:\repos\HelloWorld\HelloWorld.synproj' has an unknown project type
and cannot be added to the solution file. Contact your SDK provider for support.
Otherwise, you’ll see a message like this:
dotnet sln add HelloWorld.synproj
Project `HelloWorld.synproj` added to the solution.
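Putting the solution commands together, a typical flow from the root of a repository might look like this; the final msbuild call assumes the Synergy build tools are installed on the machine:
dotnet new sln -n MySolution
dotnet sln add HelloWorld\HelloWorld.synproj
dotnet sln add Library\MyLibrary.synproj
msbuild MySolution.sln -p:Configuration=Release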
Libraries
This section essentially combines a description of how things work and what tools are available to you with a developer version of a “choose your own adventure” story. At some point, your workplace has very likely had a script-driven, manual Traditional DBL build system that produces .dbo, .olb, and .elb files in some form. Unless you are a developer who needs to deploy to OpenVMS or AIX, you can make use of MSBuild to reproduce this script-based environment and allow your IDE or CI/CD pipeline to build your projects.
Traditional DBL (manual)
Creating object libraries
Historical context: In the past, developers often created object libraries to manage multiple related external subroutines efficiently. These OLBs were a collection of compiled object files consolidated into a single, special file for easier reference during the linking process. By using an object library, developers could avoid managing and distributing numerous .dbo
files, opting instead for a single .olb
file. Once they had the .olb
file, they could link it into their mainline program using the linker. They only had to distribute the mainline .dbr
, and the referenced contents of the .olb
would be injected into the .dbr
. This was an especially common practice on OpenVMS, where .elb
files were not supported and shared images had noticeable limitations. Because linking an .olb
into a .dbr
will only include explicitly referenced routines, they are inappropriate for routines that are called dynamically at runtime, such as a global I/O hook.
Linking your program
Modern perspective: In modern development, linkers are integrated into most IDEs and build tools, making their operations more transparent to developers.
Description: The DBL linker consolidates compiled object files into a cohesive executable program. By default, it expects object files to have the .dbo extension. The linker produces
- .dbr for executable files
- .map for map files
- .elb for executable libraries
Note to developers: Always recompile changed object files before relinking to ensure the updated code is incorporated into the new executable.
Creating executable libraries
Description: Unlike object libraries, executable libraries don’t include the object file in the final executable. Instead, they contain pointers to routines within the executable library. The DBL runtime uses these pointers to execute code from the library. This approach offers several advantages:
- Reduced program size: Executable libraries can decrease the size of your programs since they avoid duplicating compiled subroutines in every program.
- Time savings: If an object file within the executable library changes, there’s no need to relink all the dependent programs. The runtime uses the updated routine from the library directly.
- Efficient distribution: Updates to an application can be distributed by just replacing the executable library, eliminating the need for relinking.
- DBL linker creation: Developers can either convert existing object libraries into executable libraries or directly compile object files into an executable library.
Modern perspective: The distinction between object libraries and executable libraries might feel odd for developers familiar with dynamic (shared) libraries in languages like C or C++. However, the principle is somewhat similar to using shared libraries in other programming environments. Modern development environments often abstract these details, offering automated building, linking, and deployment. Developers new to DBL might find the distinction between object and executable libraries somewhat arcane, but understanding the historical context can provide insight into the evolution of your codebase. Next, we’ll look at how to use MSBuild to automate complex build processes, manage dependencies more efficiently, and integrate seamlessly with various development tools, providing a more scalable and maintainable approach to building software.
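As a rough sketch of the manual flow described above, compiling two routines and bundling them into an executable library might look like the following. The -l switch and the exact command shapes are from memory, so verify them against the dblink documentation before relying on them:
dbl -o math_utils.dbo math_utils.dbl
dbl -o str_utils.dbo str_utils.dbl
dblink -l math_lib math_utils str_utils
A mainline linked against that library then carries pointers to those routines rather than copies of them, which is what enables the distribution and relink savings described above.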
Traditional DBL (MSBuild)
TODO: Write this section
.NET
TODO: Write this section
Prototyping
Prototyping is one of the phases of compiling a Traditional DBL project. It uses a separate Synergy Prototype utility (known as dblproto
) to process your source code and produce type information to be used by later compilation phases. By default, DBL performs type checking for system-supplied routines and classes at compile time. This means the compiler assesses each system-supplied routine used in a program against its prototype, ensuring the proper number and type of arguments, the correct return type, and other specifics.
When working with a single compilation unit, you will always see the compiler performing type checking. If you’re not using MSBuild and building your project requires executing multiple DBL commands, you can use the dblproto
utility to facilitate validation.
When writing for .NET, compile-time type checking is always in effect and is slightly more strict about what sorts of things can be implicitly converted.
Compile-time type checking:
Compile-time type checking is the process by which a compiler verifies type constraints and ensures that operations performed on variables are semantically correct and type-safe. In programming languages with strong static typing, like Java, C++, and Swift, the compiler will scrutinize the code for type mismatches or incompatible operations during the compilation process, well before the code runs.
In early versions of DBL, if a routine was not previously declared and its name was encountered in a function call, it was implicitly considered to be a routine that returned a ^VAL type, with no assumptions about its arguments. If it was preceded by XCALL, there would be no return type, but the arguments were treated the same as with a function. In this second scenario, the compiler wouldn’t conduct compile-time validity checks on the quantity or types of arguments passed to it.
How it works:
When the compiler encounters a piece of code, it evaluates each variable, function, and operation based on its declared or inferred type. For example, in the statement data x, int, 5
, the compiler understands that x
is of type int
and should only hold integer values. If later in the code there’s an attempt to assign a string to x
(e.g., x = "hello"
), the compiler will flag it as a type error. Similarly, the compiler checks function calls to ensure that the right number and types of arguments are passed and checks if the return types of functions are used correctly.
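As a tiny, self-contained illustration of the example above (the exact diagnostic text differs between Traditional DBL and DBL on .NET):
main
proc
    data x, int, 5
    x = "hello"      ;; flagged by the compiler as a type error
    x = x + 1        ;; fine: both operands are integers
endmain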
Advantages:
- Early error detection: One of the most significant benefits is that many errors, particularly those involving data type mismatches or misused operations, are caught during the compilation process. This early detection can save developers a lot of time, as these errors can be harder to identify and diagnose if they only manifest during runtime.
- Optimization: Knowing the data types in advance allows the compiler to make various optimizations that can make the final code run faster and be more efficient.
- Code readability and maintenance: Strongly-typed code can be more explicit, making it easier for developers to understand the intended use and function of different variables and functions.
- Security: Type-related errors, if undetected, can lead to various vulnerabilities in software. Catching type mismatches during compilation can prevent some of these potential security risks.
How to use it:
If you’re building using MSBuild, this prototyping is mostly handled for you automatically. Projects produce prototype files (with a .dbp file extension), project references consume those prototype files, and the compiler ensures that any symbols or routines it knows about are correctly used. In .NET, this coverage is complete; there isn’t a way to accidentally use a function that the compiler doesn’t have a definition for. In Traditional DBL, though, if a subroutine is undefined, by default the compiler will not complain. This behavior was chosen because most codebases contain some circular references or other things that prevent complete prototyping. If you want to enforce complete prototyping in Traditional DBL, you can use the -qreqproto
compiler option. This can be passed to DBL on the command line, set inside your MSBuild project directly as <DBL_qReqProto>True</DBL_qReqProto>
inside a <PropertyGroup>
, or set in the Compile tab of your Visual Studio project properties.
If you’re building using dbl
from a script or directly on the command line, you will need to run dblproto
prior to dbl
. At its most basic, the flow looks like this:
dblproto -out=my_prototype_file.dbp my_source_file.dbl my_other_source_file.dbl
dbl -qimpdbp=my_prototype_file.dbp my_source_file.dbl
dbl -qimpdbp=my_prototype_file.dbp my_other_source_file.dbl
dblink -o my_program.dbr my_source_file my_other_source_file
The two invocations of dbl have been split to illustrate the prototype file, but because the compiler implicitly prototypes the source files it compiles together, calling it like this instead is functionally equivalent:
dbl -o my_program.dbo my_source_file.dbl my_other_source_file.dbl
dblink -o my_program.dbr my_program.dbo
Keep in mind that if you’re integrating dblproto
into an existing build script, you may need to supply many of the same compiler options to dblproto
that you normally supply to dbl
.
Unprototyped code:
Before dblproto
was created in version 9 of DBL, the closest thing to compile-time type checking was the external function declaration. This was only for functions and only defined the return type of a function so it would not result in any substantial compile-time type checking. Now that DBL has real compile-time type checking, these external function declarations are ignored by the compiler, but you may still see them in your code base.
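If you run across one of these legacy declarations, it will look something like the following sketch (routine names invented for illustration). Since the compiler now ignores it, it can generally be removed once everything builds with real prototyping in place:
;; legacy external function declaration: return types only, no argument checking
external function
    calc_total,     d
    format_name,    a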
TODO More setup. The reader needs to understand that compile-time type checking is good and they need strategies that will allow them to incrementally move towards this goal. Might need to cover some of the qrelaxed options here as well as provide examples of code that compiles without dblproto but fails with it. Effort here will pay off in one of the major areas of issue in leg mod.
Rules for unprototyped code:
What happens when the Traditional DBL compiler doesn’t know the type of an argument? For starters, it’s going to assume that the argument is a descriptor type of some sort, meaning a
, d
, id
, or i
. Because descriptors carry some type information with them at runtime, all is not lost if you’ve passed the wrong type into a routine. It now depends on what you try to do with that argument. Operations involving a transition from d
to a
are relatively safe for positive numbers, but you will likely get an unexpected result if you store a wrongly typed negative d
into an a
. Going the other direction, a
into d
, will have pretty terrible results if that alpha contains anything other than ASCII decimal characters. The store may work, but performing arithmetic operations on it will cause an exception. Improperly typed i
usage will result in binary data in whatever destination you’re storing into. Type id
has the same restrictions about negative numbers but with the added problem that the decimal point will be missing depending on how you use it.
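To make this concrete, here is a sketch of the kind of mismatch the compiler cannot catch when the routines live in separate source files compiled without dblproto (the names are invented for illustration); the runtime behavior follows the rules just described:
;; file: taxlib.dbl - the author expects a decimal argument
subroutine add_tax
    amount, d
    result, d
proc
    result = amount * 1.08      ;; arithmetic on a non-numeric alpha raises a runtime error
    xreturn
endsubroutine

;; file: main.dbl - compiled separately, so no prototype information is available
main
record
    sku,    a10,    "SKU-10"
    total,  d10
proc
    xcall add_tax(sku, total)   ;; compiles cleanly without prototyping, fails at runtime
endmain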
What to do about parameter type errors in older code:
The MISMATCH
modifier provides some flexibility in argument type matching during the compile-time checks. When applied to n
, n.
, a
, or d
parameter types, it permits an alpha type argument to be passed to a numeric type without triggering a prototype mismatch error. For both subroutines and functions, when MISMATCH
is used with an a
parameter type, it allows the passage of a decimal or implied-decimal argument to an alpha parameter.
However, there are nuances to be aware of. When using MISMATCH
with an n
parameter, you must exercise caution due to differences in behavior between Traditional DBL and DBL .NET. Specifically, in Traditional DBL, an alpha argument passed to a MISMATCH n
is treated as decimal. In contrast, DBL running on .NET retains the alpha type for the passed argument. This distinction can lead to varied outcomes. If your intention is to pass an alpha to a routine with an n
argument, and you don’t make use of ^DATATYPE and ^A, you should opt for ^D rather than setting the routine to MISMATCH n
.
For situations where you’re using a MISMATCH a
argument but expecting to access a d
parameter as an alpha, ensure you use ^DATATYPE and cast with ^A.
If your goal is to pass a decimal variable to a routine with an alpha-typed parameter without employing ^DATATYPE for casting, using MISMATCH a
is not a good idea. A better approach is to change the routine to accept a numeric parameter, label it as MISMATCH n
, and judiciously use ^DATATYPE and ^A when necessary.
Reinterpreting:
The following operators do not convert the expression into an alpha, integer, decimal, or implied-decimal value. Instead, they only modify how the data is referenced in that particular instance. The underlying data remains intact.
- ^A(expression): Accesses the underlying data as though it were an alpha. This is especially handy when dealing with decimal data in file I/O operations.
- ^D(expression[, precision]): Accesses the underlying data as though it were a decimal or implied decimal, depending on whether you’ve passed the precision argument.
- ^I(expression): Accesses the underlying data as though it were an integer. This is not usually very useful and should not really be done unless you know someone else put integer data there.
Example
record
decimalData, a*, "1234"
aFld, a*, "abcd"
dFld, d*, 1234
ifld, i4, 1234
proc
Console.WriteLine(^a(dFld))
Console.WriteLine(^a(-dFld))
Console.WriteLine(^a(ifld))
;this example uses %string to make output easier
Console.WriteLine(%string(^d(aFld)))
Console.WriteLine(%string(^d(decimalData)))
Console.WriteLine(%string(^d(decimalData, 2)))
Console.WriteLine(%string(^d("-" + decimalData)))
Console.WriteLine(%string(^i(aFld)))
Output
1234
123t
╥
-1234
1234
12.34
=1234
1684234849
Converting: The following functions will convert the expression into the target type, applying their respective rules while they do it.
IMPLIED(expression): Converts an expression to an implied-decimal value.
- If the expression is of integer or decimal type, the resulting value will lack fractional precision.
- If the expression is alpha, the fractional precision will match the number of digits to the right of the decimal point in the expression.
- Rounding and truncation are applied for alpha variables containing implied-decimal values with more than 28 digits to the right of the decimal point.
STRING(expression): Converts a numeric value to its string representation.
- Converts the given value to a nonblank string based on rules for moving data to an alpha destination. The optional format can alter the representation.
- Particularly useful for displaying integer values, as it genuinely converts the data, unlike ^A.
INTEGER(expression): Converts an expression to an integer value.
- The expression is changed following the rules for moving data to an integer destination.
- If the transformed expression doesn’t fit the requested integer size, the high-order bits are discarded.
- By default, the output is four bytes long, unless the expression magnitude specifies otherwise.
Example
Because the implementation of Console.WriteLine
in Traditional DBL is limited compared to .NET, we will call %STRING to pass the correct type to Console.WriteLine
. This will, of course, show the somewhat mangled output of some of these conversions.
record
decimalData, a*, "1234"
aFld, a*, "abcd"
dFld, d*, 1234
ifld, i4, 12345678
proc
Console.WriteLine(%string(dFld, "$$$,$$$.XX"))
Console.WriteLine(%string(ifld, "$$$,$$$.XX"))
Console.WriteLine(%string(%implied(decimalData) + 4321))
Console.WriteLine(%string(%implied(ifld) + 87654321))
Console.WriteLine(%string(%integer(decimalData)))
Console.WriteLine(%string(%integer(dFld)))
Console.WriteLine(%string(%integer("99999999999999", 2)))
Console.WriteLine(%string(%integer("99999999999999", 4)))
Console.WriteLine(%string(%integer("99999999999999", 8)))
Console.WriteLine(%string(%size(%integer("1234"))))
Output
$12.34
123,456.78
5555
99999999
1234
1234
16383
276447231
99999999999999
4
Following the Trend: An Inventory Project
Time for another project! This time we’ll be creating a routine that collects ordering trends and makes recommendations to restock our inventory. This project is going to make use of a lot of the concepts we’ve covered so far. You can do all of this using Visual Studio and MSBuild projects, but we’re going to do this directly using dbl
, dblink
, and rpsutl
to give you a feel for how these tools are working under the hood. If you want to see how things are set up using MSBuild, you can check this book’s companion repository on GitHub.
Overview
The GenerateRestockReco subroutine is the core of the recommendation system. It takes in inventory
, orders
, and analysisDate
as inputs and outputs trends
and recommendations
. We’ll need some fake data and a few helper routines to hold it all together. Let’s start by defining the data structures we’ll need.
Required data structures
- InventoryItem: Represents an item in inventory.
- Order: Represents a customer order.
- Trend: Used for analyzing sales trends of an item.
- Restock: Represents a restock recommendation.
We’re going to use the Synergy Data Language to define these structures and build them into a repository. This will give you a chance to work with a brand new repository and really get a feel for how it works. We’ll be using the rpsutl
tool to build the repository and .INCLUDE to bring the structures into our program. Time to dive in!
Repository
Up first, we have DBL code for the data structures we’re going to use. Then, we’re going to define these in your repository using SDL.
structure InventoryItem
ItemId, a10
Name, a40
Quantity, i4
endstructure
structure Order
OrderId, a10
ItemId, a10
Quantity, i4
OrderDate, d8
endstructure
structure Trend
ItemId, a10
ItemCount, i4
OrderCount, i4
HistoricCount, i4
endstructure
structure Restock
ItemId, a10
Quantity, i4
endstructure
You’ll notice that these structures all have the ItemId
field in common. This is going to be how we keep track of which items are which. We’ll use ItemId to join the data together later on, and we’ll also use it to look up the item name in the InventoryItem
structure. This is a common pattern that you’ll see frequently in the wild, and it’s a good idea to keep it in mind when you’re designing your own data structures. Now let’s take a look at the SDL for these structures.
STRUCTURE InventoryItem DBL ISAM
DESCRIPTION "Inventory item details"
FIELD ItemId TYPE ALPHA SIZE 10
FIELD Name TYPE ALPHA SIZE 40
FIELD Quantity TYPE INTEGER SIZE 4
END
STRUCTURE Order DBL ISAM
DESCRIPTION "Order details"
FIELD OrderId TYPE ALPHA SIZE 10
FIELD ItemId TYPE ALPHA SIZE 10
FIELD Quantity TYPE INTEGER SIZE 4
FIELD OrderDate TYPE DECIMAL SIZE 8
END
STRUCTURE Trend DBL ISAM
DESCRIPTION "Trend analysis data"
FIELD ItemId TYPE ALPHA SIZE 10
FIELD ItemCount TYPE INTEGER SIZE 4
FIELD OrderCount TYPE INTEGER SIZE 4
FIELD HistoricCount TYPE INTEGER SIZE 4
END
STRUCTURE Restock DBL ISAM
DESCRIPTION "Restock information"
FIELD ItemId TYPE ALPHA SIZE 10
FIELD Quantity TYPE INTEGER SIZE 4
END
The SDL definitions look very similar to the DBL code, but there are a few differences. First, we’ve added a DESCRIPTION
to each structure. This is a good practice to get into, because it makes it easier for other developers to understand what the structure is for. It can also make it easier to generate documentation for your code. The syntax for the fields is a little different and is generally much more verbose than the DBL version. Now that we have our structures defined, we need to build them into a repository. To start, let’s get a directory made for this project, call it “trend,” and put it somewhere appropriate. Next, we’ll write the SDL code above to a file named repository.scm
in your project directory. From inside a command prompt that has the DBL environment set up, while inside your project folder, run the following command:
set RPSMFIL=%CD%\rpsmain.ism
set RPSTFIL=%CD%\rpstext.ism
dbs RPS:rpsutl -i repository.scm -ir
First, we’re going to set RPSMFIL
and RPSTFIL
to files in the current directory. This is going to tell rpsutl
where to put the repository files, and later on it will tell dbl
where to find the repository. Next, we’re going to run rpsutl
with the -i
flag to tell it to build a repository from the SDL file. The -ir
flag tells rpsutl
to rebuild the repository if it already exists. Keep this console open; we’re going to be using it again in a minute. If you look in your project directory, you should see a few new files. These are the repository files that rpsutl
generated for us. Now that we have a repository, if we want to use it in our program, we need to tell the compiler about it. When we get to writing the bulk of this program, we’re going to use an .INCLUDE directive. Let’s take a look at how that will work.
.include "InventoryItem" REPOSITORY, structure, end
.include "Order" REPOSITORY, structure, end
.include "Trend" REPOSITORY, structure, end
.include "Restock" REPOSITORY, structure, end
.INCLUDE can also be used to insert the contents of files on disk, but in this case we have the REPOSITORY keyword to tell the compiler that we want to include from the repository we just built. The STRUCTURE keyword tells the compiler that we want it to generate a structure instead of the default of a record. The END keyword tells the compiler that we want it to put END at the end of the included code. Things like records don’t actually need an END keyword, but global structures do. Time to move on now—we have code to write!
Recommendations
Now that we have our data structures in order, let’s move on to the GenerateRestockReco
subroutine. GenerateRestockReco
is supposed to analyze inventory and sales data, ultimately providing recommendations for restocking inventory items. We’re going to show off our knowledge of collections, control flow, and comparisons. Let’s explore this subroutine in detail, understanding its intricacies and the role of each segment in achieving its objective.
Subroutine signature
.include "InventoryItem" REPOSITORY, structure, end
.include "Order" REPOSITORY, structure, end
.include "Trend" REPOSITORY, structure, end
.include "Restock" REPOSITORY, structure, end
subroutine GenerateRestockReco
in inventory, @ArrayList
in orders, @ArrayList
in analysisDate, d8
out trends, @ArrayList
out result, @ArrayList
endparams
;;data division records go here
record
oneMonth, int
oneYear, int
orderTrends, @StringDictionary
proc
;; implementation goes here
endsubroutine
Initialization and preparations
result = new ArrayList()
trends = new ArrayList()
oneMonth = %jperiod(analysisDate) - 30
oneYear = %jperiod(analysisDate) - 365
orderTrends = new StringDictionary()
In this initial section, the subroutine sets up necessary variables. Variables oneMonth
and oneYear
are calculated to represent the date thresholds for recent and historical data analysis. The %JPERIOD function converts analysisDate
into a Julian date format, from which 30 and 365 days are subtracted to get dates one month and one year prior, respectively. The variable orderTrends
is initialized as a StringDictionary
, which will be used to map item IDs to their corresponding sales trend data.
Processing orders
foreach data targetOrder in orders as @Order
begin
data orderDate, int, %jperiod(targetOrder.OrderDate)
data targetTrend, Trend
init targetTrend
targetTrend.ItemId = targetOrder.ItemId
if (orderTrends.Contains(targetOrder.ItemId))
begin
targetTrend = (Trend)orderTrends.Get(targetOrder.ItemId)
end
if(orderDate >= oneMonth) then
begin
targetTrend.ItemCount += targetOrder.Quantity
targetTrend.OrderCount += 1
end
else if(orderDate >= oneYear) then
begin
targetTrend.HistoricCount += targetOrder.Quantity
end
else
nextloop
orderTrends.Set(targetOrder.ItemId, (@*)targetTrend)
end
In this block, the subroutine iterates through each order in the orders
ArrayList. Because ArrayList only understands an untyped object
, we tell the compiler what the expected type is using the AS syntax. For each order, the routine calculates the Julian date (orderDate
) and initializes a Trend
structure (targetTrend
). This structure is then populated with data depending on whether the order falls within the one-month or one-year threshold. Orders within these thresholds contribute to the item count and order count or the historic count, reflecting recent and historical sales data. The updated trend data for each item is stored back into the orderTrends
dictionary.
Analyzing inventory and creating recommendations
foreach data targetItem in inventory as @InventoryItem
begin
if (orderTrends.Contains(targetItem.ItemId))
begin
data targetTrend = (Trend)orderTrends.Get(targetItem.ItemId)
data historicCount, d28.10, targetTrend.HistoricCount
data recentQuantity, d28.10, targetTrend.ItemCount
data historicalAverage, d28.10, historicCount / 12.0
if (recentQuantity > 1.5 * historicalAverage ||
& targetItem.Quantity < historicalAverage)
begin
data restockRequest, Restock
restockRequest.ItemId = targetItem.ItemId
restockRequest.Quantity = %integer(recentQuantity > historicalAverage ?
& recentQuantity - targetItem.Quantity:
& historicalAverage - targetItem.Quantity)
if(restockRequest.Quantity > 0)
result.Add((@*)restockRequest)
end
trends.Add((@*)targetTrend)
end
end
This segment iterates over each item in the inventory. For items that have corresponding trend data in orderTrends
, the subroutine calculates the historical average sales and compares that value with the recent sales quantity. If the recent sales exceed 1.5 times the historical average or if the current inventory is below the historical average, a restock request is created. This request includes the item ID and the calculated restock quantity, which is determined based on whether the recent sales or the historical average is greater. Restock requests with a positive quantity are added to the result
ArrayList, which contains all recommendations. Additionally, the trend data for each item is added to the trends
ArrayList for potential further analysis.
Test data
We’re going to need to build some test data to run this routine against. We’ll need a few helper routines to accomplish this. Let’s start with the MakeItem
function. This function will take in an item ID, name, and quantity and return an InventoryItem
structure, which we’ll use to populate our inventory.
function MakeItem, InventoryItem
itemId, n
name, string
quantity, int
endparams
record
inv, InventoryItem
proc
inv.ItemId = %string(itemId)
inv.Name = name
inv.Quantity = quantity
freturn inv
endfunction
There’s not really anything new here; we’re just taking in some parameters and returning a structure. This saves us a few repeated lines later on since we’re going to create a few items. Next, we’ll need a function to build orders. This function will take in an item ID, quantity, order count, and order date. It will then generate orders for that item for the specified quantity, count, and date. We’ll use what’s generated to populate a large number of orders.
subroutine BuildOrders
itemId, n
quantity, int
orderCount, int
date, d8
orders, @ArrayList
endparams
record
ord, Order
i, int
proc
for i from 1 thru orderCount by 1
begin
ord.ItemId = %string(itemId)
ord.OrderDate = date
ord.OrderId = %string(orders.Count)
ord.Quantity = quantity
orders.Add((@Order)ord)
end
xreturn
endsubroutine
The above code is doing just a little bit more than the MakeItem
function. Because we’re going to be generating a lot of orders, we’re going to use a FOR loop to do it. We’re also going to be adding these orders to an ArrayList so we can keep track of them. Now that we have our helper routines, let’s build some test data.
main
record
restockRequests, @ArrayList
trends, @ArrayList
inventory, @ArrayList
orders, @ArrayList
analysisDate, d8
proc
inventory = new ArrayList()
orders = new ArrayList()
;;this is YYYYMMDD for September 28th, 2023
;;this is the format that %jperiod expects
analysisDate = 20230928
;;Let's be explicit and box the returned structures
inventory.Add((@InventoryItem)MakeItem(1, "widget", 50))
inventory.Add((@InventoryItem)MakeItem(2, "doodad", 25))
inventory.Add((@InventoryItem)MakeItem(3, "thingy", 100))
inventory.Add((@InventoryItem)MakeItem(4, "whatchacallit", 0))
;;all of these dates are YYYYMMDD
BuildOrders(1, 5, 20, 20230928, orders)
BuildOrders(1, 3, 20, 20230728, orders)
BuildOrders(1, 3, 20, 20230503, orders)
BuildOrders(1, 3, 20, 20230328, orders)
BuildOrders(2, 5, 20, 20230928, orders)
BuildOrders(3, 5, 20, 20230404, orders)
BuildOrders(4, 5, 2000, 20230104, orders)
endmain
Most of this code is starting to be old hat, but there are a few things to call out. Because ArrayList only understands Object
, we need to box our structures by casting them to @InventoryItem
or @Order
. You may see other developers write this as (@*)
, which is shorthand for casting to Object
, but the problem with that is you don’t really know if that is going to result in a boxed alpha representation of your structure or an actual boxed InventoryItem
. The practical effect of this distinction is minimal in Traditional DBL, but if you’re running on .NET, the type system is much more strict. We’re also going to be using the BuildOrders
function to generate a lot of orders into our orders
variable.
Generating restock recommendations
Time for the moment of truth. Let’s add our GenerateRestockReco
subroutine to the bottom of our main and see what we get.
GenerateRestockReco(inventory, orders, analysisDate, trends, restockRequests)
This call to GenerateRestockReco
is going to process the inventory and order data to generate restocking recommendations. The subroutine outputs two ArrayLists: trends
, which holds trend data for each item, and restockRequests
, which contains the restocking recommendations. Let’s go through the steps to build this project and see what happens.
Building the project
Make sure you have the DBL environment set up and you’re still in the project directory. Additionally, make sure you’ve still got RPSMFIL and RPSTFIL set from the repository section.
We’re making use of the StringDictionary that we built back in Dictionary, and to use that, we’re going to need to write that source to a file in our project directory. I’ve named it StringDictionary.dbl
, and I’ve named the source we were just working on Program.dbl. Now run the following commands:
dbl Program.dbl StringDictionary.dbl
dblink Program
dbs Program
The first two commands should complete without error. If you get an error from the dbl
compile step, make sure your repository is set up, that you have the most recent version of DBL installed, and that you’ve written both Program.dbl and StringDictionary.dbl to the project directory. The dbs
command will run the program and output the results to the console. You should see something like this:
%DBR-S-STPMSG, STOP
We didn’t get any output! What happened? Well, we didn’t tell the program to output anything. Let’s add some code to do that.
Outputting restock recommendations
After the call to GenerateRestockReco
, add the following code:
foreach data restockReq in restockRequests as @Restock
begin
;;We can go grab the item from inventory and show the name
data item = (@InventoryItem)inventory[%integer(restockReq.ItemId) - 1]
Console.WriteLine("restock request for " + %atrim(item.Name) +
& " with quantity " + %string(restockReq.Quantity))
end
This loop iterates over the restockRequests
ArrayList. For each restock request, it retrieves the corresponding inventory item to display the item’s name and the recommended restock quantity. This is a very simple way to output the recommendations, but at least we can see that the subroutine is working. Let’s build and run the program again.
dbl Program.dbl StringDictionary.dbl
dblink Program
dbs Program
This time we should see some output:
dbs Program
restock request for widget with quantity 50
restock request for doodad with quantity 75
restock request for whatchacallit with quantity 833
%DBR-S-STPMSG, STOP
We can see that the subroutine is working as expected. It’s recommending restocking for the widget, doodad, and whatchacallit items. The widget and doodad items have been selling well recently, and the whatchacallit item has been selling well historically. The subroutine is also recommending a higher quantity for the whatchacallit item because it has a history of massive sales spikes. Although this is a very simple example, it shows how we can use the tools we’ve learned to build a useful routine. We can do better though. In the next section, we’re going to explore how we can use the trends
ArrayList to generate a very simple PDF report of the sales trends for each item.
Reporting
It’s time to do something interesting with this trend data that we collected in the preceding section. Synergex has a wrapper library for the Haru PDF library that we can use to generate a PDF report of our sales trends. To start, we’re going to need to download the source for the wrapper library, which you can find in the PDFKit repository on GitHub; download it to your project directory. Next, we’ll need to grab the prebuilt Haru libraries for our platform. Download the three DLLs and put them in your project directory too. Now we’re ready to start writing our report.
New imports
At the top of Program.dbl
, add the following imports:
import HPdf
New subroutine signature
Next, we’re going to add a new subroutine to generate the report. Add the following code to the bottom of Program.dbl
:
subroutine GenerateSalesTrendsReport
in trends, @ArrayList
in inventory, @ArrayList
endparams
record
pdf, @HPdfDoc
page, @HPdfPage
font, @HPdfFont
yPos, float
proc
;; implementation goes here
endsubroutine
Our routine needs the trends we’ve collected, and we’re also going to use the inventory list to get the item names.
Initialization of the PDF
Inside the procedure division, we’re going to start by initializing the PDF document using the following code:
pdf = new HPdfDoc()
This is going to create a new instance of HPdfDoc
, which is a class from the Haru PDF library. This instance represents the PDF document we’re going to create and modify.
Adding a new page and setting up
Next we’re going to add the first page to the document and set the font using the following code:
page = pdf.AddPage()
page.SetSize(HPdfPageSizes.HPDF_PAGE_SIZE_A4, HPdfPageDirection.HPDF_PAGE_PORTRAIT)
The size of the page is set to A4, and the orientation is defined as portrait. Although this is the default page size and orientation, it’s good practice to set it explicitly.
Setting up the font
To add text to the page, we need to tell Haru what font to use. We’re going to use the following code to set the font:
font = pdf.GetFont("Helvetica", ^null)
page.SetFontAndSize(font, 12)
We’ve chosen the oh-so-trendy Swiss font Helvetica with a size of 12. This font setting will apply to all the text added to the PDF pages.
Initializing the Y position for text
To support multiple pages, we need to keep track of the vertical position of the text on the page. We’ll do this using the following code:
yPos = page.GetHeight() - 50
The yPos
variable is initialized to manage the vertical position of text on the page. This line positions the first line of text 50 units from the top of the page.
Looping through the trends array list
Now we’ve gotten to the interesting part: we’re going to loop through the trends array list and add each trend to the PDF document. I’ve broken this up a little bit for readability, but we’re going to fill in this loop in the next step. For starters, let’s get the loop in place using the following code:
foreach data trend in trends as @Trend
begin
data item = (@InventoryItem)inventory[%integer(trend.ItemId) - 1]
;;process each trend
end
In this loop, each Trend
object in the trends
ArrayList is processed. For each trend, the corresponding InventoryItem
is retrieved from the inventory
ArrayList. Doing this allows the item name to be included in the report.
Adding the items in our trends loop
Inside the loop, we’re going to actually add the items to the PDF document using the following code:
if (yPos < 50)
begin
page = pdf.AddPage()
page.SetSize(HPdfPageSizes.HPDF_PAGE_SIZE_A4,
& HPdfPageDirection.HPDF_PAGE_PORTRAIT)
page.SetFontAndSize(font, 12)
yPos = page.GetHeight() - 50
end
page.BeginText()
page.MoveTextPos(50, yPos)
page.ShowText("Item: " + %atrim(item.Name) +
& ",Item Count: " + %string(trend.ItemCount) +
& ", Order Count: " + %string(trend.OrderCount) +
& ", Historic Count: " + %string(trend.HistoricCount))
page.EndText()
yPos -= 20
Here, the subroutine checks if the current page has enough space for more text. If yPos
is less than 50, which means the page is nearly full, a new page is added with the same settings as before. The routine then adds text to the page, including the item’s name and its sales trend data. After each entry, yPos
is adjusted to move to the next line, ensuring proper spacing between lines of text.
Saving and closing the PDF document
Now, outside of the trends loop, we’re going to save and close the PDF document using the following code:
pdf.SaveToFile("SalesTrendsReport.pdf")
pdf.FreeDoc()
After all trends are processed and added to the document, the PDF is saved to a file named “SalesTrendsReport.pdf”. The FreeDoc
method is then called to release the resources associated with the PDF document.
Calling the subroutine
Now that we’ve written the subroutine, we need to call it. Add the following code to the bottom of main
:
GenerateSalesTrendsReport(trends, inventory)
Building and running the program
dbl Program.dbl StringDictionary.dbl pdfdbl.dbl
dblink Program
dbs Program
This time we should see the same output as before, but we should also see a new file in our project directory named SalesTrendsReport.pdf
:
dbs Program
restock request for widget with quantity 50
restock request for doodad with quantity 75
restock request for whatchacallit with quantity 833
%DBR-S-STPMSG, STOP
Open the newly created PDF, and it should have text that looks like this:
Item: widget, Item Count: 100, Order Count: 20, Historic Count: 180
Item: doodad, Item Count: 100, Order Count: 20, Historic Count: 0
Item: thingy, Item Count: 0, Order Count: 0, Historic Count: 100
Item: whatchacallit, Item Count: 0, Order Count: 0, Historic Count: 1000
If you got error text that looks like this:
%DBR-E-DLLOPNERR, DLL could not be opened: libhpdf64.dll
%DBR-I-ERTXT2, System err: (126) The specified module could not be found.
you probably don’t have all three of the DLLs in your project directory. Go back and make sure all of the DLLs are in your directory and that they are bit-size/platform matched. For example, I ran dblvars64.bat in my command prompt, so I’m running 64-bit Windows; therefore, when I downloaded the DLLs from GitHub, I grabbed the DLLs in the Windows64 directory. If you ran dblvars32.bat, you would need to grab the DLLs from the Windows32 directory. Hopefully everything is running now, and you can see the PDF report. If you’re having trouble, you can check the companion repository on GitHub for the completed code.
Testing
Unit testing, integration testing, and end-to-end testing each play their own distinct role in producing high-quality software but differ significantly in scope, focus, and complexity.
Unit testing is the most granular form of testing, focusing on individual units or components of the software in isolation, such as functions or methods. The primary goal is to ensure that each part of the code performs as expected, facilitating easy debugging by isolating the source of a bug. Unit tests are narrow in scope, testing only a small part of the application, and are typically fast and easy to maintain due to their isolated nature. Tools like MSTest for .NET and Synergex Test Framework for Traditional DBL are commonly used for unit testing.
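To give a flavor of what a unit test looks like in DBL running on .NET, here is a minimal sketch using MSTest. The attribute and assertion names come from the MSTest framework, while the routine under test (%ComputeRestockQuantity) is a hypothetical stand-in for your own code:
import Microsoft.VisualStudio.TestTools.UnitTesting

namespace InventoryTests

    {TestClass}
    public class RestockTests

        {TestMethod}
        public method RestockQuantity_IsNeverNegative, void
        proc
            data qty, int
            qty = %ComputeRestockQuantity(10, 5)    ;; hypothetical routine under test
            Assert.IsTrue(qty >= 0)
        endmethod

    endclass

endnamespace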
Integration testing, on the other hand, examines the interaction between integrated units, or modules, of the software. This type of testing is concerned with data flow and control flow among modules, ensuring that combined parts of the application work together as intended. The scope of integration testing is wider than unit testing, and it often reveals issues related to module interfaces and interaction. While more complex than unit testing due to the dependencies between components, integration testing is crucial for catching integration-related bugs. Tools like Postman for API testing, TestComplete, and Selenium for UI testing are often used in integration testing.
End-to-end testing takes the broadest approach, encompassing the entire application in a scenario that mimics real-world usage. It’s concerned with the flow of the application from start to finish, ensuring that the system meets external requirements and user expectations. End-to-end tests validate the system’s behavior and performance in a production-like environment, making them the most complex and the slowest to execute. They also tend to be the most challenging to maintain due to their dependency on the entire application and its environment.
In essence, while unit testing ensures the reliability of individual components, integration testing verifies the interactions between these components, and end-to-end testing validates the overall functionality and user experience of the entire system. Each type of testing serves a distinct and vital role in the software development lifecycle, contributing uniquely to the overall quality and reliability of the software. In this chapter when we talk about testing, we’re going to be focusing on unit testing. In the next section, we’ll look at some specific strategies for making code more testable.
Testable Code
We sometimes take for granted the idea that code should be testable, but what does that really mean? We’re going to dig into a few ways that we can incrementally improve the testability of our codebases. This section still matters to you if you’re writing brand new code! You can use these techniques to make your code more testable from the start.
Parameterize all the things!
Passing data through parameters, rather than using globals or data in files, is our starting point for testable code. It’s not uncommon (pardon the pun) to see code that uses commons/globals to control the operation of subroutines/functions that are called. This is a very common pattern in legacy codebases and is a major source of pain when trying to test the code. Let’s take a look at a simple example to show you what I mean:
main
record
global common
fld1, d10
endcommon
proc
fld1 = 5
xcall increment()
Console.WriteLine(%string(fld1))
endmain
subroutine increment
common
fld1, d10
endcommon
proc
fld1 += 1
xreturn
endsubroutine
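For contrast, here is a sketch of the same example with the common removed and the value passed as a parameter. The routine can now be exercised from a test with any input, without touching global state:
main
record
    fld1, d10
proc
    fld1 = 5
    xcall increment_p(fld1)
    Console.WriteLine(%string(fld1))
endmain

subroutine increment_p
    inout counter, d10
proc
    counter += 1
    xreturn
endsubroutine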
Why didn’t everyone just parameterize everything from the start? Well, there are a few reasons, but the big ones are solved now. Before the DBL compiler had support for strong prototyping, you had no way to reliably detect that the parameter list for a routine had changed. This meant that it was potentially a massive effort to grep through your entire codebase to find all the places that called a routine and potentially update the parameter list. This operation was potentially recursive, as you might need to update the parameter list of the routines that called the routine you were updating. Given the pressure to ship working software, this situation was often avoided by just putting tons of stuff into global/common data. Today, not only can the compiler tell you if you’ve missed a parameter, but navigating the codebase to find and update the calling routines is trivial with Visual Studio.
Let’s take a look at a few of the benefits of parameterizing your code:
- Isolation of functions/methods: Parameters allow functions or methods to operate in isolation. They don’t rely on external state (like global variables) or external resources (like files), making them predictable and consistent. This isolation simplifies the testing process because you only need to consider the inputs and outputs of the function, not the state of the entire system.
- Control of test conditions: When you use parameters, you can easily control the input for testing purposes. This enables you to create a wide range of test cases, including edge cases, without needing to manipulate global state or file contents, which can be cumbersome and error-prone.
- Reduced side effects: Global variables and file operations often lead to side effects, where a change in one part of the system unexpectedly affects another part. By using parameters, you minimize these side effects, making it easier to understand and test each part of the code in isolation.
- Easier mocking and stubbing: In unit testing, it’s common to use mocks or stubs to simulate parts of the system. Parameters make it easier to inject these mocks or stubs, as you’re simply passing different values or objects through the parameters. This is more complex with global variables or file-based inputs, as it might require changing the global state or file content for each test.
- Concurrency and parallel testing: When tests rely on parameters rather than shared global state or files, they can be run in parallel on .NET without the risk of interfering with each other. This is crucial for efficient testing, especially in large codebases or when running tests in a continuous integration environment.
- Documentation and readability: Functions that explicitly declare their inputs through parameters are generally more readable and self-documenting. This clarity is beneficial for testing, as it makes it easier to understand what a function does and what it needs to be tested with.
- Refactoring and maintenance: When your code relies on parameters rather than globals or files, it’s often easier to refactor. You can change the internals of a function without worrying about how it will impact the rest of the system, which is particularly important when maintaining and updating older code.
Parameterizing existing code
Hopefully you’re convinced that parameters are preferable to globals, but you have a massive codebase and you’re not sure where to start. The good news is that you can start small and work your way up. Let’s talk about using adapters and bridges to incrementally improve the testability of your codebase.
Before: The caller communicates with the original function through a combination of zero or more parameters plus global state in the form of a GDS or COMMON.
Bridge functions: In a bridge function, the main idea is to connect global state to a more modular, parameterized function. The bridge function handles the passing of global state as parameters.
Adapter functions: An adapter function is used to adapt the interface of one function to another. It takes parameters from the caller, modifies the global state, and then calls your original parameterless function.
Both adapter functions and bridge functions fall under the category of wrappers. They wrap around the original function, providing a controlled interface to it. The difference is that a bridge function is a wrapper that takes global state as parameters, whereas an adapter function is a wrapper that takes parameters from the caller and modifies global state.
An example
We’re going to start with a couple of small functions that rely on global state, and then we’ll show both approaches to parameterizing them.
subroutine my_interesting_routine
out result, n
common
fld1, d10
fld2, d10
proc
xcall another_routine()
result = fld1 + fld2
xreturn
endsubroutine
subroutine another_routine
common
fld1, d10
fld2, d10
record
tmp, i4
proc
tmp = fld1 + fld2
if((tmp .band. 1) == 1)
fld1 = fld1 + 1
xreturn
endsubroutine
main
record
global common
fld1, d10
fld2, d10
endcommon
record
result, i4
proc
fld1 = 5
fld2 = 9000
xcall my_interesting_routine(result)
Console.WriteLine(result)
endmain
This is obviously a silly example, but it will demonstrate many of the problems we have to solve when refactoring legacy code. Let’s start by creating a parameterized version of my_interesting_routine
:
subroutine my_interesting_routine_p
in fld1, d10
in fld2, d10
out result, n
proc
xcall another_routine()
result = fld1 + fld2
xreturn
endsubroutine
This is a good start, but we still have a problem: another_routine
is still using global state. Let’s try making a parameterized version of another_routine
:
subroutine another_routine_p
inout fld1, d10
in fld2, d10
record
tmp, i4
proc
tmp = fld1 + fld2
;;if tmp is an odd number
if((tmp .band. 1) == 1)
fld1 = fld1 + 1
xreturn
endsubroutine
If you try to call another_routine_p
from my_interesting_routine_p
, you’ll get a compiler error. The compiler is telling you that you can’t pass an in
parameter to an inout
parameter. This is a good thing! The compiler is telling you that you’re trying to change the value of a parameter that you’ve told it is read only. This is a great example of how the compiler can help you write better code.
subroutine my_interesting_routine_p
inout fld1, d10
in fld2, d10
out result, n
proc
xcall another_routine_p(fld1, fld2)
result = fld1 + fld2
xreturn
endsubroutine
subroutine another_routine_p
inout fld1, d10
in fld2, d10
record
tmp, i4
proc
tmp = fld1 + fld2
if((tmp .band. 1) == 1)
fld1 = fld1 + 1
xreturn
endsubroutine
Okay, now the parameterized version of my_interesting_routine
can be used as a functional equivalent to the original. Let’s wrap it with our bridge function:
subroutine my_interesting_routine
out result, n
common
fld1, d10
fld2, d10
proc
xcall my_interesting_routine_p(fld1, fld2, result)
xreturn
endsubroutine
That’s not so hard, but notice that this method of wrapping our global state away is somewhat viral. We had to change the original function to call the parameterized version, and then we had to change the parameterized version to call the parameterized version of the other function. If we had tried to mix a parameterized version with the original or with a wrapper, it would have been very easy to subtly change the behavior of the code, introducing hard-to-track-down bugs. If you’re going to use bridge functions as a strategy, this is a good example of why we want to start small and work our way up.
Now that you’ve seen how to use a bridge function to wrap a function that uses global state, let’s look at how we can use an adapter function to do the same job:
subroutine my_interesting_routine_p
inout param1, d10
in param2, d10
out result, n
common
fld1, d10
fld2, d10
proc
fld1 = param1
fld2 = param2
xcall my_interesting_routine()
param1 = fld1
xreturn
endsubroutine
This version is functionally equivalent to the original, but now it takes parameters and uses them to pass in and out the global state. If you need to start big and a little dirtier, using adapter functions is a good way to go. You can wrap the entire function and then start breaking it down into smaller pieces. This is a good strategy if you’re trying to get a large codebase under test quickly. You will of course need to watch out for any unknown side effects of routines called by the function you’re wrapping. In our example, we had already learned that fld1
was being modified by another routine, so we knew we needed to pass it in and out of our wrapper function. If we hadn’t known that, we would have had to discover it by testing the function and then refactoring it to use parameters.
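One immediate payoff of the parameterized version is that it can be exercised directly, with no global setup. Here is a rough sketch of what a caller (or a test harness) might look like, assuming the parameterized routines above are linked in:
main
record
    fld1, d10
    result, i4
proc
    fld1 = 5
    ;; No common blocks to prime; the inputs are explicit
    xcall my_interesting_routine_p(fld1, 9000, result)
    Console.WriteLine(%string(result))
endmain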
Dependency injection (.NET)
Dependency injection (DI) is a design pattern that changes how dependencies are managed within your codebase. It’s a technique where dependencies (such as services, objects, or functions) are “injected” into a component (like a class or function) from the outside, rather than being created inside the component.
Without DI, components often create instances of their own dependencies. This tight coupling makes it hard to modify or test individual components, as changes in one dependency can ripple through the entire system. Moreover, testing such components in isolation becomes challenging, as they rely on the actual implementation of their dependencies.
Dependency injection addresses these issues by decoupling components from their dependencies. Instead of a component initializing its dependencies, the dependencies are provided to it, often through constructors, setters, or specific DI frameworks. This decoupling means components don’t need to know where their dependencies come from or how they are implemented. They just need to know that the dependencies adhere to a certain interface or contract.
The impact on testability is substantial:
Easier unit testing: With DI, you can easily provide mock implementations of dependencies when unit testing a component. This allows for testing the component in isolation, without worrying about the intricacies of its dependencies.
Reduced test complexity: Since dependencies can be replaced with simpler, controlled versions (like stubs or mocks), the complexity of setting up test environments is significantly reduced. This simplifies writing, understanding, and maintaining tests.
Increased code reusability: DI encourages writing more modular code. Modules or components that are designed to be independent from their dependencies are inherently more reusable in different contexts.
Flexibility and scalability: Changing or upgrading dependencies becomes easier and safer. Since components are not tightly bound to specific implementations, swapping or modifying dependencies has minimal impact on the components themselves.
Design for testability: DI fosters a design mindset where developers think about testability from the outset. Designing components with DI in mind leads to cleaner interfaces and more focused component responsibilities.
Later on in this chapter, we’ll discuss using fakes for .NET and Traditional DBL, as an alternative to injecting mocks with DI.
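To make this concrete, here is a minimal constructor-injection sketch in DBL targeting .NET. The IInterestCalculator interface, StandardInterestCalculator, and InterestService are hypothetical names invented for this illustration; the shape of the pattern, not the specific API, is the point.
namespace Example
    public interface IInterestCalculator
        method CalculateInterest, d28.2
            balance, d28.2
        endmethod
    endinterface
    ;; Production implementation of the contract
    public class StandardInterestCalculator implements IInterestCalculator
        public method CalculateInterest, d28.2
            balance, d28.2
        proc
            mreturn balance * 0.05
        endmethod
    endclass
    ;; The consumer never creates its dependency; it receives it from outside
    public class InterestService
        private calculator, @IInterestCalculator
        public method InterestService
            aCalculator, @IInterestCalculator
        proc
            calculator = aCalculator
        endmethod
        public method ApplyInterest, d28.2
            balance, d28.2
        proc
            mreturn balance + calculator.CalculateInterest(balance)
        endmethod
    endclass
endnamespace
In a unit test, InterestService could just as easily be constructed with a fake calculator that returns a fixed amount, so the test never depends on the production interest rules.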
Unit Testing
DBL developers have two primary options for unit testing: the Synergex Test Framework for Traditional DBL and MSTest for .NET. The Synergex Test Framework attempts to mimic the syntax of MSTest as closely as possible. Both frameworks use attributes to mark classes and methods as tests, and both use the Assert
class to verify test results. The main visible difference is that MSTest uses the Microsoft.VisualStudio.TestTools.UnitTesting
namespace, and Traditional DBL uses the Synergex.TestFramework
namespace. We’re going to go over the tools you have available to you in both frameworks, and then we’ll walk through creating and running some tests.
Attributes
TestClass: This attribute is used to denote a class that contains unit test methods. In both DBL and MSTest, any class marked with this attribute is recognized as a container for test methods. This is foundational for organizing test scripts, where each test class typically focuses on a specific area of functionality.
TestMethod: Applied to methods within a TestClass, this attribute identifies individual tests. Each method marked as a TestMethod represents a distinct test case. This is where the actual testing logic resides, with each method typically testing a single functionality or scenario.
TestInitialize: This attribute marks a method that is run before each test in the test class. It’s used for setup operations that need to be repeated before every test, such as initializing objects, setting default values, or configuring the test environment. This ensures that each test starts from a consistent state.
TestCleanup: The counterpart to TestInitialize, methods with this attribute are executed after each test in the class. This is where you clean up or dispose of resources used during the test, ensuring that one test’s side effects don’t affect another. It’s essential for maintaining isolation between tests.
ClassInitialize: Used for setup that needs to run once before any of the tests in the class are executed, this attribute is typically for more resource-intensive operations. This could involve setting up database connections, preparing external resources, or establishing other once-per-class setup tasks.
ClassCleanup: The companion to ClassInitialize, this attribute is used for teardown operations that should occur after all tests in a class have been run. It’s used to release resources that were set up in the ClassInitialize method, such as closing database connections or cleaning up files.
Ignore: This attribute is applied to tests that should be skipped. It’s useful for temporarily disabling a test, perhaps because it’s failing due to an external dependency or if it’s not relevant under certain conditions. The Ignore attribute allows for flexibility in test execution without removing the test code.
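Here is a small sketch of how the lifecycle attributes typically fit together in an MSTest project; the same shape applies to the Synergex Test Framework by swapping in the Synergex.TestFramework namespace. The class and method names are ours, and we’re assuming the initialization and cleanup attributes use the same curly-brace syntax as TestClass and TestMethod:
import System.Collections
import Microsoft.VisualStudio.TestTools.UnitTesting
namespace MyUnitTestProject
    {TestClass}
    public class ListTests
        private items, @ArrayList
        ;; Runs before every test, so each test starts with a fresh list
        {TestInitialize}
        public method Setup, void
        proc
            items = new ArrayList()
        endmethod
        ;; Runs after every test
        {TestCleanup}
        public method Teardown, void
        proc
            items.Clear()
        endmethod
        {TestMethod}
        public method NewListIsEmpty, void
        proc
            Assert.AreEqual(0, items.Count)
        endmethod
        {TestMethod}
        public method AddIncreasesCount, void
        proc
            items.Add("first")
            Assert.AreEqual(1, items.Count)
        endmethod
    endclass
endnamespace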
Asserts
Several static methods for verifying test results are available to you in Synergex.TestFramework.Assert
and Microsoft.VisualStudio.TestTools.UnitTesting.Assert
. We’re going to go over the basics, but if you want to dig into the exact signatures, you can check Synergex Test Framework documentation and the MSTest documentation.
AreEqual and AreNotEqual: These methods are used to verify that two values are equal or not equal, respectively. They’re typically used to compare the actual result of a test to the expected result. The first parameter is the expected value, and the second parameter is the actual value. If the values are equal, the test passes. If they’re not equal, the test fails and the test runner displays the expected and actual values along with a message if one is provided.
AreSame and AreNotSame: These methods are used to verify that two objects are the same or not the same, respectively. They’re typically used to compare the actual result of a test to the expected result. The first parameter is the expected object, and the second parameter is the actual object. This comparison is much less common, and AreEqual
is usually preferred to AreSame
. If the objects are the same instance, the test passes. If they’re not exactly the same instance, the test fails and the test runner displays the expected and actual objects along with a message if one is provided.
Fail: This method is used to force a test to fail. It’s typically used to indicate that you’ve reached a code branch that should never be reached.
Inconclusive: This method is used to indicate that a test is inconclusive. It’s typically used to indicate that a test is not applicable under certain conditions.
IsFalse and IsTrue: These methods are used to verify that a condition (typically Boolean) is false or true, respectively. The first parameter is the condition to test, and the second parameter is a message to display if the test fails.
IsNull and IsNotNull: These methods are used to verify that an object is null or not null, respectively. The first parameter is the object to test, and the second parameter is a message to display if the test fails.
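The following sketch exercises a few of the condition and null asserts described above; the class and method names are invented for illustration:
import Microsoft.VisualStudio.TestTools.UnitTesting
namespace MyUnitTestProject
    {TestClass}
    public class AssertExamples
        {TestMethod}
        public method ConditionChecks, void
        record
            isPositive, boolean
        proc
            isPositive = (5 > 0)
            Assert.IsTrue(isPositive, "5 should be greater than zero")
            isPositive = (5 < 0)
            Assert.IsFalse(isPositive, "5 is not less than zero")
        endmethod
        {TestMethod}
        public method NullChecks, void
        record
            str, @string
        proc
            Assert.IsNull(str, "no string has been assigned yet")
            str = "hello"
            Assert.IsNotNull(str, "str now references a string")
        endmethod
    endclass
endnamespace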
Traditional DBL
Initial project setup
dotnet new synTradUnitTestProj -n MyUnitTestProject
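The generated project contains a test class that looks roughly like the following sketch; apart from the imported namespace, it mirrors the .NET example shown later in this section (the names here are the template defaults, which may differ between versions):
import Synergex.TestFramework
namespace MyUnitTestProject
    {TestClass}
    public class Test1
        {TestMethod}
        public method TestMethod1, void
        proc
            Assert.AreEqual(1, 1)
        endmethod
    endclass
endnamespace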
Running tests
Once you’ve built the project, you can run the tests by opening a Visual Studio developer command prompt, moving to the output directory for the project, and then running the following command:
vstest.console.exe MyUnitTestProject.elb
.NET
Initial project setup
dotnet new synNETUnitTest -n MyUnitTestProject
import System
import Microsoft.VisualStudio.TestTools.UnitTesting
namespace MyUnitTestProject
{TestClass}
public class Test1
{TestMethod}
public method TestMethod1, void
proc
Assert.AreEqual(1, 1)
Assert.AreNotEqual(1, 2)
if(1 != 1)
Assert.Fail("This should never happen")
endmethod
endclass
endnamespace
Running tests
dotnet test
Starting test execution, please wait...
A total of 1 test files matched the specified pattern.
Passed! - Failed: 0, Passed: 1, Skipped: 0, Total: 1, Duration: 49 ms - MyUnitTestProject.dll (net6.0)
Test-Driven Development in the Brownfield
TODO: Walk through building fakes/mocks/stubs for your unit tests in a legacy system. In Traditional DBL only, rely on the linker overwriting the original routine with the shim; in .NET, MS Fakes can do the same thing, or the MIT-licensed Pose library (https://github.com/tonerdo/pose).
Object-Oriented Features
DBL supports both structured and object-oriented programming styles. Object-oriented programming (OOP) offers a way to organize data based on the concept of objects, which are the fundamental components of programs. OOP enables developers to model entities and relationships in a structured, intuitive way. This chapter explains how taking advantage of OOP concepts will help you to design maintainable, reusable, and extensible code.
Classes and objects are the basic elements of OOP. A class is like a blueprint that specifies the attributes and behaviors (aka methods) common to a specific type of object. An object is an instance of a class, representing a specific entity with its own defined properties and functionality.
One pillar of OOP is inheritance, which allows a class (called a derived class or a subclass) to acquire attributes and methods from another class (called a base class or superclass). Inheritance encourages code reuse and creates a natural hierarchy between general and specific concepts, making code more modular and easier to maintain.
Interfaces and delegates are more advanced constructs of OOP that improve code flexibility and reusability. An interface is like a contract that a class must comply with. It defines the properties, methods, and events that a class must implement, which guarantees consistent behavior across different implementations. A delegate provides a way to pass methods as parameters, enabling event-driven programming and more dynamic interaction between components.
Classes: The Building Blocks of Object-Oriented Programming
In object-oriented programming (OOP), classes serve as the foundational building blocks. They encapsulate data and methods that operate on that data into a single unit. When you hear the word “encapsulation,” think of it as a protective shield that prevents external code from accidentally modifying the internal state of an object. This shield ensures data integrity and allows for modular and maintainable code.
Defining classes
[access] [modifiers] CLASS name [EXTENDS class] [IMPLEMENTS interface, ...]
& [INHERITS [class,] [interface[, ...]]]
member_def
.
.
.
ENDCLASS
The syntax for declaring a class is pretty similar to that of a structure, so this should already look pretty familiar. EXTENDS
, IMPLEMENTS
, and the catch-all INHERITS
all have to do with specifying a base type or interface to be implemented by a class. EXTENDS
is specifically for a base class. IMPLEMENTS
allows a list of interfaces to be implemented. INHERITS
can take any combination of a base class or list of interfaces and is a newer syntax option, because it turns out that separating classes and interfaces syntactically is a real pain.
Members
A class typically comprises three main components: fields, properties, and methods. Fields hold the state of the object, and properties expose that state in a controlled way. For example, in a Person class, Name and Age could be properties backed by fields. Methods, on the other hand, are functions defined within the class that act on or use that state. Continuing with our Person analogy, CelebrateBirthday() might be a method that increases the Age property by one.
Common class member modifiers
Methods and properties both share many of the same concepts because under the hood, properties are syntactic sugar that just make a certain type of method more friendly to call and understand. Modifiers: A method or property can have several modifiers. Some common ones include
- STATIC: Can be accessed without referencing an object.
- ABSTRACT: Must be implemented in a derived class.
- OVERRIDE: Overrides a parent class’s method.
- VIRTUAL: Can be overridden in a subclass.
- CONST (field): A static field whose value is not changed once it is initialized. CONST implies STATIC.
- READONLY (field): A field’s value can only be set during declaration or in the constructor of the class.
- And more, such as ASYNC (method), BYREF, EXTENSION (method), NEW, SEALED, etc.
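As a rough sketch of how a few of these modifiers read in practice, here is an invented Counter class that applies CONST, READONLY, and STATIC, following the field and method syntax described later in this chapter:
namespace Example
    public class Counter
        ;; CONST implies STATIC; the value is fixed once initialized
        public const MaxCount, int, 100
        ;; A READONLY field can only be set here or in a constructor
        private readonly startValue, int, 0
        private current, int
        public method Counter
        proc
            current = startValue
        endmethod
        ;; STATIC members are accessed through the class rather than an instance
        public static method Describe, @string
        proc
            mreturn "Counts up to " + %string(MaxCount)
        endmethod
    endclass
endnamespace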
Visibility or access modifiers determine the accessibility of a class, method, property, or field. They define the scope from which these members can be accessed. The main purpose of visibility modifiers is to encapsulate the data, ensuring that it remains consistent and safe from unintentional modifications.
Here’s a brief overview of common visibility modifiers:
- PUBLIC:
- When a class or its member is declared as public, it means it can be accessed from any other class or method, regardless of where it is in the program or the package.
- It offers the most unrestricted level of accessibility.
- PRIVATE:
- A private class member can only be accessed from within the same class in which it is declared.
- It’s a way to hide the internal details of a class, exposing only what’s necessary and keeping the rest hidden and safe from external interference.
- PROTECTED:
- Protected members are accessible from within the same class as well as in subclasses.
- They cannot be accessed from unrelated classes outside of these contexts. (Some languages, such as Java, also grant package-level access to protected members, but that behavior is language specific.)
- INTERNAL (.NET):
- An internal member can be accessed by any code in the same assembly, but not outside of it.
- Internal members are particularly useful when you want to make something accessible to all classes within a specific assembly but hidden from outside consumers.
- PROTECTED INTERNAL (.NET):
- This modifier is a combination of both protected and internal. Members with this modifier are accessible from the current assembly and from derived classes.
- It provides a more nuanced level of accessibility.
Importance of visibility modifiers:
-
Encapsulation: One of the core tenets of OOP is encapsulation, which means bundling data and methods that operate on that data into a single unit and restricting the direct access to some of the object’s components. Visibility modifiers help achieve encapsulation by controlling the accessibility of class members.
-
Maintainability: By controlling the access to class members, you can prevent unintended interactions or modifications. This makes the system easier to maintain and reduces the risk of bugs.
-
Flexibility: Over time, the internal implementation of a class may need to change. If you’ve restricted access to internal details using private or protected visibility, you can change details without affecting classes that depend on the one you’re modifying.
-
Data integrity: Ensuring that data is accessed and modified in controlled ways can prevent the system from entering into an inconsistent or invalid state.
Defining methods
[access] [modifiers ...] METHOD name, return_type[, options]
parameter_def
PROC
.
.
ENDMETHOD|END
Key components
-
Name: The method’s identifier.
-
Return type: Specifies what kind of data the method will send back when called.
-
Parameters (parameter_def): Values you can pass into a method to be used within its execution.
-
ROUND or TRUNCATE options: By default, a program rounds all expression results. However, specifying the
ROUND
orTRUNCATE
option on a METHOD can override the default.
Defining properties
[access] [modifiers ...] PROPERTY name, property_type
method_def
...
ENDPROPERTY
Properties are unique members of a class that provide a flexible mechanism to read, write, or compute the value of a private field. Properties can be used as if they are data members, but they are actually special methods called accessors. The method_def
part of the above syntax description is mostly like defining any other method, with the main differences being that you don’t have to declare parameters or a return type.
Auto properties
[access] [modifier ...] simple_mod PROPERTY name, property_type [,initial_value]
Auto properties or simple properties save you the tedium of having to write the accessor code if you just want to define a field and expose it with some control, like only allowing code inside the class to set a value but allowing anyone to get a value (SETPRIVATE
).
simple_mod: Determines the behavior of the generated accessors. Options include INITONLY, READONLY, READWRITE, SETPROTECTED, SETPRIVATE, SETINTERNAL, and SETPROTECTEDINTERNAL.
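For example, here is a sketch of an auto property that anyone can read but only the class itself can set; the Invoice class is invented for illustration and follows the syntax shown above:
namespace Example
    public class Invoice
        ;; Anyone can read Total, but only code inside the class can set it
        public setprivate property Total, d18.2, 0
        public method AddLine, void
            amount, d18.2
        proc
            Total = Total + amount
        endmethod
    endclass
endnamespace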
Why use properties?
Properties offer a way to control access to the data in your objects. You can define what actions can be taken with that data, which can be essential for maintaining the integrity of your objects. For instance, properties can prevent inconsistent modifications, validate new data before storing it, or perform specific actions when data is accessed.
Properties vs. fields
Fields are variables that hold the state of an object. When you want to provide controlled access or take certain actions when a field is read or written, you can use properties. A property will typically have a backing field that is a private member variable of the property’s type. When the property gets accessed or modified, the corresponding getter or setter will be invoked, allowing the developer to add custom logic.
Defining fields
[access] [modifier] [name], [dimension|[dim, …]]type[size][, init_value, …]
Most of the complexity seen in field definitions is about dealing with the fixed-size types like a
, d
, and i
and fixed-size arrays of those types. Here’s a slightly less comprehensive look at defining a class field.
[access] [modifier] [name], type[, init_value]
Other than access modifiers, class field declarations are largely the same as what you’ve seen inside records, groups, or structures, and in fact, you can declare records, groups, and structures within a class as you would in the data division of a routine.
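A short sketch of what class field declarations can look like in practice (the Customer class is invented for illustration):
namespace Example
    public class Customer
        ;; Fixed-size fields look just like record fields
        private id, d10
        private name, a40
        ;; An object field and a field with an initial value
        private notes, @string
        private visitCount, int, 0
    endclass
endnamespace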
Inheritance and polymorphism
One of the powerful features of classes is the ability to create hierarchies through inheritance. A class can inherit properties and methods from another class, referred to as its “base” or “parent” class. This enables the creation of more specific subclasses from general parent classes. For instance, from a general Vehicle
class, one could derive more specific classes like Car
or Bike
.
Polymorphism, closely tied to inheritance, allows objects of different classes to be treated as objects of a common superclass. This flexibility means that different classes can define their own unique implementations of methods, yet they can be invoked through a reference of the parent class, enhancing code reusability and flexibility.
Both of these topics will be further covered in the later sections about inheritance and objects.
Constructors and destructors
Classes often come with special methods called constructors and destructors. A constructor is automatically invoked when an object of the class is instantiated, and it typically initializes the object’s properties. Destructors, on the other hand, are called when the object is about to be destroyed, providing an opportunity to release resources or perform cleanup operations.
Overloaded methods
Method overloading is a feature that allows a class to have multiple methods with the same name but with different parameters. Consider the Console.WriteLine
method in .NET, a staple for output in this book. In .NET this method is overloaded to accept a variety of data types, from strings to integers to custom objects. Whether you’re passing a string, an integer, or even a decimal, you use the same method name, Console.WriteLine
. The specific version of Console.WriteLine
that gets executed depends on the type and number of arguments you provide. For example, Console.WriteLine("Hello, World!")
will use the string overload, while Console.WriteLine(123)
will use the integer overload. This allows for a consistent method name, promoting readability, while catering to various data needs. By leveraging method overloading, developers can maintain a streamlined method-naming convention, making code both more intuitive and readable.
Overloaded methods don’t require any specific syntax; they are just multiple method declarations with the same name but a different number of parameters or different parameter types.
Example
namespace ComplexTypesExample
public class Rectangle
;; Fields
private length_fld, d18.10
private width_fld, d18.10
;; Properties
public property Length, id
method get
proc
mreturn length_fld
endmethod
method set
proc
length_fld = value > 0 ? value : 0
endmethod
endproperty
public property Width, id
method get
proc
mreturn width_fld
endmethod
method set
proc
width_fld = value > 0 ? value : 0
endmethod
endproperty
public property Area, id
method get
proc
mreturn length_fld * width_fld
endmethod
endproperty
;; Constructor overloading
public method Rectangle
endparams
this(1.0, 1.0)
proc
endmethod
public method Rectangle
side, id
endparams
this(side, side)
proc
endmethod
public method Rectangle
aLength, id
aWidth, id
endparams
proc
Length = aLength
Width = aWidth
endmethod
;; Method overloading
public method SetDimensions, void
side, id
endparams
proc
SetDimensions(side, side)
endmethod
public method SetDimensions, void
aLength, id
aWidth, id
endparams
proc
Length = aLength
Width = aWidth
endmethod
public method Display, void
endparams
proc
Console.WriteLine("Rectangle Dimensions: " +%string(Length) + " x " + %string(Width))
Console.WriteLine("Area: " + %string(Area))
endmethod
endclass
endnamespace
main
proc
begin
;; Using default constructor
data square = new Rectangle()
;; Using constructor with one parameter
data anotherSquare = new Rectangle(5)
;; Using constructor with two parameters
data rect = new Rectangle(4, 6)
square.Display()
anotherSquare.Display()
rect.Display()
;; Using method overloading
rect.SetDimensions(7)
rect.Display()
rect.SetDimensions(7, 8)
rect.Display()
end
endmain
Output
Rectangle Dimensions: 1.0000000000000000000000000000 x 1.0000000000000000000000000000
Area: 1.0000000000000000000000000000
Rectangle Dimensions: 5.0000000000000000000000000000 x 5.0000000000000000000000000000
Area: 25.0000000000000000000000000000
Rectangle Dimensions: 4.0000000000000000000000000000 x 6.0000000000000000000000000000
Area: 24.0000000000000000000000000000
Rectangle Dimensions: 7.0000000000000000000000000000 x 7.0000000000000000000000000000
Area: 49.0000000000000000000000000000
Rectangle Dimensions: 7.0000000000000000000000000000 x 8.0000000000000000000000000000
Area: 56.0000000000000000000000000000
Objects
Object handle syntax
When specifying a type, you mostly need to use the @
symbol to indicate that the type is an object. For example, to declare a variable of type System.Object
, you would use the following syntax:
data obj, @System.Object
The @ can be omitted when declaring a variable of type string or System.String, and also in some cases related to generic types, but it is otherwise required. For example, the following code is valid:
data str, string
Casting
Casting allows developers to convert one data type into another using the syntax (typename)variable
. This explicit cast is essential when the compiler cannot automatically determine the type conversion or when you intend to treat an instance of one type as another. While this casting method is direct, it can throw an exception if the cast is invalid. To enhance type-checking capabilities, DBL provides the is
and as
operators. The is
operator checks if an object is of a specific type, returning a Boolean result. For example, if (obj .is. SomeType)
would check if obj
is of type SomeType
. On the other hand, ^AS performs a safe cast. It attempts to cast an object to a specified type, and, if the cast isn’t possible, it returns ^null
instead of throwing an exception. For instance, data result = ^as(obj, @SomeType)
would attempt to cast obj
to SomeType
, assigning the result to result
or ^null
if the cast fails. ^AS is only available when compiling for .NET.
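Here is a small sketch pulling these pieces together when compiling for .NET; Shape and Circle are hypothetical classes, with Circle extending Shape:
main
proc
    begin
        data shape, @Shape
        data circle, @Circle
        data maybeCircle, @Circle
        shape = new Circle()
        ;; Explicit cast: throws an exception if shape is not actually a Circle
        circle = (@Circle)shape
        ;; Type check with .is.
        if (shape .is. Circle)
            Console.WriteLine("shape holds a Circle")
        ;; Safe cast with ^as (.NET only): returns ^null instead of throwing
        maybeCircle = ^as(shape, @Circle)
        if (maybeCircle == ^null)
            Console.WriteLine("shape was not a Circle")
    end
endmain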
Object lifetime
When using Traditional DBL, the lifetime of an object is controlled by keeping a list of all the live roots. This is sort of like reference counting, but it doesn’t have a problem with circular references. Destructors will be called immediately when the last reference to an object goes out of scope. This is different from .NET, where the garbage collector will run at some point in the future and clean up objects that are no longer referenced. Keep this in mind when dealing with objects that have destructors. In .NET, developers will often use the Dispose pattern to clean up resources. When compiling for .NET, DBL offers some syntax support for resource clean-up in the form of the DISPOSABLE
modifier on a DATA statement. It works similarly to using
in C# and will call the Dispose
method on the object when it goes out of scope. For example:
disposable data obj, @MyDisposableObject
Boxing and unboxing
The process of converting value types and descriptor types into objects, and vice versa, is referred to as “boxing” and “unboxing.” These actions are seamlessly handled by the compiler and runtime.
Understanding boxing
Boxing essentially wraps a value type, like an integer or structure, into an object. Declaring a specific-typed box can be done by prepending the typename with @
, like @i
or @my_struct
. For boxing, you can employ either the specific cast syntax (@type)
or a generic (object)
. Depending on the method chosen, the behavior can differ:
- Using a (@type) cast: obj = (@d)dvar
Remember, when dealing with numeric literals, you must specify if the literal is an integer or implied-decimal, with (@i4) or (@id) respectively.
- Using an (object) cast: obj = (object)mystructvar
In DBL, boxing takes place automatically under several scenarios, such as
- Passing a literal or value type to a System.Object parameter.
- Assigning a value type to a System.Object variable.
- Assigning a literal to a System.Object field or property.
When a value gets boxed, it becomes type @System.Object
. This limits its accessibility to only the members of System.Object
.
Understanding unboxing
Unboxing, the counterpart to boxing, extracts the original value from the object. Usually, this involves explicitly casting the object back to its original type. In some scenarios, DBL allows for automatic unboxing, such as when
- Passing a boxed type to a matching unboxed type parameter.
- Assigning a boxed type to a variable with a matching type.
- Accessing a field of a typed boxed structure.
To unbox explicitly, use a (type) cast:
dval = (d)obj
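Putting boxing and unboxing together, a minimal sketch might look like this:
main
record
    dvar, d10.2
    dval, d10.2
    obj, @System.Object
proc
    dvar = 123.45
    obj = (@d)dvar      ;; boxing: the decimal value is wrapped in an object
    dval = (d)obj       ;; unboxing: cast the object back to a decimal
    Console.WriteLine(%string(dval))
endmain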
Inheritance
One of the earliest object-oriented programming concepts taught is that of inheritance and polymorphism. At its core, this concept teaches that a derived class can inherit properties and behaviors from a base class and can override or extend these behaviors. This foundational concept is pivotal in building scalable and maintainable software. A common example is different shapes (Circle
, Rectangle
, etc.) all inheriting from a base Shape
class. But we’re going to try something just the slightest bit less contrived that you might actually be able to use. As developers delve deeper into OOP and software design patterns, the utility of polymorphism begins to shine in more subtle, advanced, and powerful ways. One of these is the delegation pattern, which can be thought of as an evolution or specialized use of polymorphism.
The delegation pattern via base class
In the delegation pattern, instead of performing a task itself, an object delegates the task to a helper object. This can be achieved elegantly using polymorphism by defining an interface or a base class and then creating delegate classes that implement the specific behavior.
Consider an application that needs to notify users of certain events. Multiple notification methods exist: email, SMS, push notification, etc. In the future, there could be additional notification methods, and you don’t really want to have a super method that handles every possible kind of notification. So you do this:
import System.Collections
namespace DelegatePattern
abstract class Notifier
public abstract method Notify, void
user, @string
message, @string
proc
endmethod
endclass
class EmailNotifier extends Notifier
public override method Notify, void
user, @string
message, @string
proc
;; Logic to send an email
endmethod
endclass
class SMSNotifier extends Notifier
public override method Notify, void
user, @string
message, @string
proc
;; Logic to send an SMS
endmethod
endclass
class PushNotifier extends Notifier
public override method Notify, void
user, @string
message, @string
proc
;; Logic to send a push notification
endmethod
endclass
class EventManager
private notifierInstances, @ArrayList
public method EventManager
proc
notifierInstances = new ArrayList()
endmethod
public method AddHandler, void
aNotifier, @Notifier
proc
notifierInstances.Add(aNotifier)
endmethod
public method AlertUser, void
user, @string
message, @string
proc
foreach data instance in notifierInstances as @Notifier
instance.Notify(user, message)
endmethod
endclass
endnamespace
In this design, the EventManager
doesn’t need to know how the user is notified—it just knows it has a method to do so. We’ve abstracted the notification mechanism and encapsulated it within specific “delegate” classes (EmailNotifier, SMSNotifier, PushNotifier). After creating an instance of EventManager
, the specific notifier is added to the internal array list. This provides a high degree of flexibility.
- Flexibility: Easily add new notification methods without changing the EventManager code.
- Single responsibility: Each class has a distinct responsibility. The notifiers just notify, and the EventManager manages events.
- Decoupling: The EventManager is decoupled from the specifics of notification, making the system more modular and easier to maintain.
In this example, the power of polymorphism is not just in categorizing similar objects but in defining behaviors that can be swapped out or combined as needed. It’s a step toward more advanced design patterns and showcases the depth and versatility of OOP principles in real-world scenarios.
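Wiring this up from calling code might look roughly like the following sketch, based on the classes above (the user name and message are placeholders):
import DelegatePattern
main
proc
    begin
        data manager = new EventManager()
        manager.AddHandler(new EmailNotifier())
        manager.AddHandler(new SMSNotifier())
        ;; Every registered notifier receives the alert
        manager.AlertUser("jdoe", "Your statement is ready")
    end
endmain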
Overriding virtual functions in concrete base classes
In many scenarios, the base class serves as a default implementation that’s appropriate for most cases. However, certain specialized situations may require variations or enhancements to this default behavior. This is where the power of overriding in concrete (non-abstract) base classes shines.
For instance, consider a WebServiceClient class that already provides a concrete implementation of fetching data. This is a little bit contrived, but if you suspend your disbelief for a moment and imagine you want to bake in some extra HTTP headers, you might implement that by extending the base client, overriding the part that does the HTTP request, adding in your headers, and then letting the base type implementation finish the job. It would look something like this example code:
namespace WrappingBaseFunctionality
public class WebServiceClient
public method ExampleSize, int
proc
mreturn FetchData(^null).Length
endmethod
protected virtual method FetchData, @string
requestHeaders, [#]string
record
status, int
errtxt, string
responseHeaders, [#]string
responseBody, string
endrecord
proc
status = %http_get("http://example.com/",5,responseBody,errtxt,requestHeaders,responseHeaders,,,,,,,"1.1")
mreturn responseBody
endmethod
endclass
public class FancyWebServiceClient extends WebServiceClient
protected override method FetchData, @string
requestHeaders, [#]string
record
status, int
errtxt, string
localRequestHeaders, [#]string
responseHeaders, [#]string
responseBody, string
endrecord
proc
localRequestHeaders = new string[requestHeaders != ^null ? requestHeaders.Length : 1]
localRequestHeaders[localRequestHeaders.Length - 1] = "X-SOME-AUTH-HEADER: value"
;;now that we've added our new header, we can call the base class implementation
mreturn parent.FetchData(localRequestHeaders)
endmethod
endclass
endnamespace
Interfaces
Interfaces play the role of defining a blueprint for methods, properties, and events. An interface is akin to a contract; it specifies a set of operations that a class or a structure must implement, without providing the implementation details itself. It’s a way to enforce certain behaviors and functionalities across different classes, ensuring they adhere to a specific protocol. Interfaces are only supported when targeting .NET.
What are interfaces?
An interface is a type definition, similar to a class, but it purely represents a contract or a template. It defines what is to be done but not how it is to be done. Interfaces contain method signatures, properties, and events, but these members lack any implementation. When a class or a structure implements an interface, it agrees to fulfill this contract by providing concrete implementations for the interface’s members.
Characteristics of interfaces
- Abstract members: Interfaces historically only declare members but do not implement them. This is changing with the introduction of default interface implementations in .NET.
- Multiple inheritance: A class or structure can implement multiple interfaces, thus supporting multiple inheritance, which isn’t directly possible with base classes.
- Polymorphism: Interfaces enable polymorphic behavior. A class instance can be treated as an instance of an interface if it implements that interface.
Key uses of interfaces in .NET:
-
Decoupling code: Interfaces help in decoupling the implementation from the interface, allowing changes in implementation without affecting the interface consumers.
-
Testing and mocking: Interfaces facilitate easier unit testing and mocking. By programming to an interface, you can replace implementation with mock objects in testing scenarios.
-
Design flexibility: Interfaces allow developers to define functionalities that can be adopted by various unrelated classes, providing flexibility in design.
-
Contract-based development: Interfaces ensure that a class adheres to a specific contract, making it easier to understand and use.
-
API design: In API development, interfaces are used to define contracts that external systems can implement and interact with.
-
Extensibility: Interfaces provide a way to extend a system with new features without breaking existing functionality.
Syntax for declaring interfaces in DBL
The syntax for declaring an interface in DBL is very similar to declaring a class. Just like a class, interfaces must be declared within a namespace. Here’s the basic structure:
namespace Example
interface InterfaceName
; Interface members go here
endinterface
endnamespace
In this structure,
interface
is the keyword used to begin the declaration.InterfaceName
is the name of the interface, which should follow the naming conventions.- The interface body contains declarations of methods, properties, and events without implementations.
endinterface
marks the end of the interface declaration.
Guidelines for naming interfaces
When naming interfaces in .NET, certain conventions are generally followed to ensure clarity and consistency:
-
PascalCase naming: Like classes and methods, interface names should use PascalCase (e.g.,
IReadable
,IMyInterface
). -
‘I’ Prefix: Starting interface names with an uppercase “I” to distinguish them from classes and other types is a common practice in .NET. This “I” denotes “Interface.”
-
Descriptive names: The name should clearly describe the behavior or capability that the interface represents, for example,
IDisposable
for disposability and IEnumerable
for enumeration. -
Keep it short and intuitive: While being descriptive, the name should not be excessively long. It should be easily memorable and intuitive for other developers.
Simple interface declaration
Here’s an example of a simple interface declaration:
namespace Example
interface IShape
method Draw, void
someparameter, string
endmethod
method GetArea, double
endmethod
readonly property Color, string
endinterface
endnamespace
In this example, IShape
is an interface in the Example
namespace with two methods, Draw
and GetArea
, and one property, Color
. The Draw
method takes a single parameter of type string
. This interface can be implemented by any class that represents a shape, and it enforces the implementation of the Draw
and GetArea
methods and the Color
property in those classes.
Implementing interfaces
To implement an interface, a class must provide concrete definitions for all the members declared in the interface. A class uses the implements
keyword to specify the interface it is implementing. A class can implement multiple interfaces, separating each with a comma.
Explicit vs. implicit implementation
Implicit implementation: Implicit implementation means the method in the class has the same signature as the method in the interface. It is the most common implementation and is straightforward. The class methods naturally match the interface contract without any special syntax.
Explicit implementation: In explicit implementation, the interface name is prefixed to the method name. This is particularly useful when a class implements multiple interfaces that may have methods with the same signature but require different implementations. In DBL, explicit interface members are accessible only through a variable of the interface type.
Examples
Simple interface implementation:
namespace Example
public interface IVehicle
method Drive, void
endmethod
endinterface
public class Car implements IVehicle
public method Drive, void
proc
Console.WriteLine("Car is driving")
endmethod
endclass
endnamespace
Here, Car
implicitly implements the IVehicle
interface.
Explicit interface implementation:
namespace Example
public interface IPrinter
method Print, void
endmethod
endinterface
public interface IScanner
method Scan, void
endmethod
endinterface
public class MultiFunctionDevice implements IPrinter, IScanner
method IPrinter.Print, void
proc
;; Printer-specific implementation
endmethod
method IScanner.Scan, void
proc
;; Scanner-specific implementation
endmethod
endclass
endnamespace
Here, MultiFunctionDevice
implements both IPrinter
and IScanner
interfaces. Prefixing each method name with its interface makes it explicit which contract the method fulfills, which becomes essential when two interfaces declare members with the same signature. Calling code will need to use the interface type to access an explicit implementation.
Implementing interfaces with default implementations (for .NET 6+):
namespace Example
public interface IVehicle
method Drive, void
proc
Console.WriteLine("Vehicle is driving")
endmethod
method Refuel, void
endmethod
endinterface
public class ElectricCar implements IVehicle
;; Implements Refuel only; uses default Drive implementation
public method Refuel, void
proc
Console.WriteLine("Charging battery")
endmethod
endclass
endnamespace
ElectricCar
implements Refuel
and inherits the default Drive
method from IVehicle
.
Interfaces vs. abstract classes
-
Definition and capabilities:
- Interfaces: When targeting .NET 6+, interfaces can include default implementations for methods. This allows interfaces to define not just the contract (method signatures) but also provide a base implementation. However, they still cannot hold state (fields).
- Abstract classes: Abstract classes can provide both complete (implemented) and incomplete (abstract) methods. They can also contain fields, constructors, and other implementation details, offering a more comprehensive template for derived classes.
-
Inheritance and flexibility:
- Interfaces: A key advantage of interfaces is the ability to implement multiple interfaces in a single class, enabling a form of multiple inheritance and greater flexibility in combining different behaviors.
- Abstract classes: Classes can only inherit from one abstract class, enforcing a more traditional, linear inheritance hierarchy. This is useful for a clear and structured base but is less flexible than interfaces.
-
Member types and state management:
- Interfaces: Even with default implementations, interfaces cannot maintain state through fields. All members are inherently abstract or virtual.
- Abstract classes: Classes can have a mix of abstract and non-abstract members, including fields, allowing them to maintain state and provide more comprehensive functionality.
-
Access modifiers and member visibility:
- Interfaces: Prior to .NET 6, interface members were always public. Interfaces now support more complex access patterns like private and protected members.
- Abstract classes: Classes offer full flexibility with access modifiers, allowing public, protected, and private member visibility.
When to use an interface over an abstract class
-
Role or capability modeling:
- Choose interfaces when defining a set of capabilities or roles that classes can adopt, especially when these capabilities can be combined or are independent of the class hierarchy.
-
Enhanced flexibility with default implementations:
- With default implementations, interfaces now offer a mix of defined behaviors and abstract declarations. Use interfaces when you want to provide a default behavior that classes can override or extend.
-
Multiple inheritance:
- If a class needs to incorporate functionality from multiple sources, interfaces remain the best choice due to their ability to support a form of multiple inheritance.
-
Decoupling and future evolution:
- Interfaces are better for decoupling where you want to separate operation definitions from the class hierarchy. They are also preferable when expecting future expansion or changes in the contract, as adding new methods with default implementations doesn’t break existing implementations.
When to prefer abstract classes
-
Shared base functionality with state:
- Use abstract classes when there’s a need for a shared base that includes not just behavior (methods) but also state (fields and properties).
-
Controlled inheritance:
- When a strict and controlled inheritance structure is required, with a clear common base for all subclasses, an abstract class is more appropriate.
-
Comprehensive base functionality:
- Abstract classes are ideal when you need to provide a more comprehensive base, including constructors, fields, and a mix of implemented and abstract methods.
Practical use cases for interfaces in DBL
Interfaces in DBL are powerful tools for creating flexible, maintainable, and scalable software architectures. They are especially beneficial in scenarios where abstraction, decoupling, and multiple inheritance of behaviors are required. Here are some common scenarios and design patterns where interfaces are particularly useful:
1. Decoupling modules and layers
Interfaces are instrumental in decoupling different parts of an application. By programming to an interface rather than a concrete implementation, you can change the underlying implementation without affecting the clients of the interface. This is particularly useful in the following:
- Service layers: For example, in a multi-layered architecture, the service layer can expose interfaces, allowing the presentation layer to interact with the service layer without knowing the implementation details.
- Repository patterns: In database operations, interfaces can abstract the data access layer, allowing for easier swapping of database technologies or mocking of data access for testing.
2. Dependency injection and inversion of control
In modern software design, especially in test-driven development (TDD), interfaces are crucial for dependency injection (DI) and inversion of control (IoC). Interfaces define contracts for services or components, and concrete implementations are injected at runtime. This approach simplifies unit testing and reduces coupling.
3. Strategy pattern
The strategy pattern is a behavioral design pattern that enables selecting an algorithm’s runtime implementation. By defining a family of algorithms as an interface and making each algorithm a separate class that implements this interface, you can switch between different algorithms dynamically.
Implementing the strategy pattern using interfaces
Let’s consider a practical example: a sorting application where the sorting algorithm can vary.
Define the strategy interface
Create an interface for the sorting strategy.
namespace Example
public interface ISortStrategy
method Sort, void
arg, @List<int>
endmethod
endinterface
endnamespace
Implement concrete strategies
Create different sorting algorithms implementing the interface.
namespace Example
public class QuickSort implements ISortStrategy
public method Sort, void
arg, @List<int>
proc
;; QuickSort implementation
endmethod
endclass
public class MergeSort implements ISortStrategy
public method Sort, void
arg, @List<int>
proc
;; MergeSort implementation
endmethod
endclass
endnamespace
Context class
A context class uses the sorting strategy.
namespace Example
public class SortContext
private strategy, @ISortStrategy
public method SortContext
strategy, @ISortStrategy
proc
this.strategy = strategy
end
public method SetStrategy, void
strategy, @ISortStrategy
proc
this.strategy = strategy
endmethod
public method SortData, void
aData, @List<int>
proc
strategy.Sort(aData)
endmethod
endclass
endnamespace
Using the strategy
The client can now use SortContext
with different strategies.
import System.Collections.Generic
main
proc
begin
data myList = new List<int>() {5, 6, 7, 1, 2, 10, 88}
data sortContext = new SortContext(new QuickSort())
sortContext.SortData(myList)
;; Switch to a different strategy
sortContext.SetStrategy(new MergeSort())
sortContext.SortData(myList)
end
endmain
In this example, the use of interfaces allows for flexibility and extensibility in the sorting algorithms. New sorting strategies can be added without modifying the context or client code, adhering to the open/closed principle, one of the SOLID principles of object-oriented design. This is a classic example of how interfaces promote better software design practices.
Best practices in designing and using interfaces
Define clear contracts
Interfaces should represent a clear contract. Define methods and properties that are cohesive and relevant to the interface’s purpose. Avoid adding unrelated methods that can lead to a violation of the interface segregation principle (another of the SOLID principles).
Use descriptive names
Choose names that clearly convey the interface’s purpose. For example, IPrintable
or IDataRepository
are more descriptive and intuitive than vague names like IData
or IProcess
.
Favor small, focused interfaces
Create small and focused interfaces rather than large, do-it-all interfaces. This increases the flexibility and reusability of your interfaces and adheres to the interface segregation principle.
Consider default implementations carefully
With the capability of default implementations in interfaces, use them judiciously. Default methods can simplify interface evolution, but overusing them can lead to confusion and difficulties in understanding the inheritance hierarchy.
Document interface expectations
Document what each method in the interface is expected to do, its parameters, return type, and any special behavior or requirements. This is crucial for others who implement the interface to understand its purpose and usage.
Use interfaces for decoupling
Leverage interfaces to decouple components of your application. This facilitates testing (especially unit testing), maintenance, and future enhancements.
Interface inheritance
Use interface inheritance to extend or modify the functionality of an interface. However, ensure that the new interface logically extends the old one and doesn’t just add unrelated methods.
Consider future changes
Design interfaces with future evolution in mind. Changing an interface after it’s widely used can be problematic. With default implementations, it’s easier to add new methods, but changing existing ones can still break compatibility.
Common pitfalls to avoid when working with interfaces
Over-engineering
Avoid creating interfaces for classes that don’t need them, especially if there’s only one implementation. This can lead to unnecessary complexity.
Ignoring interface segregation
Avoid creating large interfaces that force implementers to write methods they don’t need. This not only bloats the code but can also lead to errors and inefficient implementations.
Misusing default implementations
Be cautious when adding default implementations in interfaces. They should not be used as a workaround for multiple inheritance or to inject common functionality better suited for an abstract class.
Confusing interface with implementation
Avoid designing interfaces based on how they will be implemented. Interfaces should be defined based on what they represent or need to accomplish, not how they will achieve it.
Breaking changes
Avoid making changes to interfaces that will break existing implementations. Consider the impact of changes on all classes that implement the interface.
Lack of documentation
Make sure to document your interfaces. Clear documentation is vital for understanding the purpose and usage of an interface, especially in large and complex projects.
Delegates
TODO: Write this section
Banking on Basics: Exploring OOP with a Simple Bank Simulation
In this chapter, we’ll use the object-oriented features of DBL to build a simple bank simulation. We’re going to model fundamental banking operations, focusing on accounts and transfers, and interact with users through a straightforward command line interface. We aren’t going to cover file I/O in this chapter, so we’ll be using a simple in-memory collection to hold our data. This will allow us to focus on the object-oriented features of DBL without getting bogged down in the details of file I/O. Let’s get started by creating a new project. Although this is going to be a .NET project, we won’t use anything that isn’t available in a DBL project targeting the Traditional DBL runtime.
We’ll begin by creating a new solution and a .NET console app using dotnet new
:
mkdir BankExample
cd BankExample
dotnet new sln
dotnet new synNETApp -n BankApp
dotnet sln add BankApp\BankApp.synproj
We’re also going to make use of the Repository to define things like account data and transfer records. So let’s create a repository project, add it to our solution, and add a reference to it from our console app:
dotnet new synRepoProj -n Repository
dotnet sln add Repository\Repository.synproj
dotnet add BankApp\BankApp.synproj reference Repository\Repository.synproj
Now that we have our projects created and hooked up, let’s dive into designing our accounts in the next section.
Accounts
As we start to model our super simple bank simulation, we’re going to begin with the most basic building block, the account. We’re going to start by defining a repository structure for a very simple account that has a balance, an account number, and also a name so we can identify it. Go ahead and open Repository\repository.scm
and add the following structure definition:
STRUCTURE Account DBL ISAM
DESCRIPTION "Bank account information"
FIELD AccountId TYPE INTEGER SIZE 4
FIELD Name TYPE ALPHA SIZE 40
FIELD Balance TYPE DECIMAL SIZE 28 PRECISION 2
END
The field sizes here are pretty arbitrary. Later on in this chapter, we’ll walk through a more in-depth analysis of how to determine the appropriate sizes for some example fields.
The code we’re going to write needs to handle the following operations:
-
Management:
- Account creation: Users can create new bank accounts. Each account will have unique attributes like account number and balance.
- Account details: Users can view details of an account, including the account number and the current balance.
-
Transactions:
- Deposits: Users can deposit money into any account. This involves increasing the account balance by the deposit amount.
- Withdrawals: Users can withdraw money, provided they have sufficient balance. This decreases the account balance.
- Transfers: Users can transfer money from one account to another. This involves withdrawing money from one account and depositing it into another.
-
Reporting:
- Balance inquiry: Users can check the balance of any account.
- Total assets: The application can calculate and display the total assets held across all accounts.
Most of these operations are going to be handled by the Account
class. Let’s go ahead and create a new class in BankApp\Account.dbl
:
namespace BankApp
.include "Account" REPOSITORY, structure="AccountData", end
class Account
private innerData, AccountData
;; Constructor for Account
public method Account
accountId, int
name, string
proc
innerData.AccountId = accountId
innerData.Balance = 0.0
innerData.Name = name
endmethod
;; Method to deposit money
public method Deposit, void
amount, d.
proc
if (amount > 0)
begin
innerData.Balance += amount
end
endmethod
;; Method to withdraw money
public method Withdraw, boolean
amount, d.
proc
if (amount > 0 && innerData.Balance >= amount) then
begin
innerData.Balance -= amount
mreturn true
end
else
mreturn false
endmethod
public property Balance, d28.2
method get
proc
mreturn innerData.Balance
endmethod
endproperty
public property AccountId, int
method get
proc
mreturn innerData.AccountId
endmethod
endproperty
endclass
endnamespace
Now we’ve hidden the details of the account data structure behind the Account
class and its accessors. This is a good start, but we’re going to need to be able to manage the accounts. We are going to be using ArrayList
as our primary data storage mechanism with the account IDs being the index into the ArrayList. This is a very simple way to store our data and is not appropriate for a real-world application. In a later chapter, we’re going to revisit this to use a more appropriate data storage mechanism.
Storing account objects: Each account, represented as an instance of the Account
class, is stored in an ArrayList
. This allows us to dynamically add, remove, and access accounts as needed.
Dynamic resizing: Unlike static arrays, ArrayList
can dynamically resize itself. This is particularly useful for our bank application as the number of accounts can increase over time.
Data persistence: In this version of the application, data is not persisted to disk. When the application is closed, all data is lost.
Onwards to the implementation of the Bank
class that will manage our accounts. Go ahead and create a new class in BankApp\Bank.dbl
:
import System.Collections

namespace BankApp

    class Bank
        private accounts, @ArrayList
        private accountIdOffset, int

        ;; Constructor for Bank
        public method Bank
            accountIdOffset, int
        proc
            accounts = new ArrayList()
            this.accountIdOffset = accountIdOffset
        endmethod

        ;; Method to create a new account
        public method CreateAccount, @Account
            name, @string
        record
            newAccount, @Account
        proc
            newAccount = new Account(accounts.Count, name)
            accounts.Add(newAccount)
            mreturn newAccount
        endmethod

        ;; Method to find an account by account number
        private method FindAccount, @Account
            accountId, int
        record
            realId, int
        proc
            realId = accountId - accountIdOffset
            mreturn (@Account)accounts[realId]
        endmethod

        ;; Method to deposit money into an account
        public method Deposit, void
            accountId, int
            amount, d.
        record
            account, @Account
        proc
            account = FindAccount(accountId)
            if (account != ^null)
            begin
                account.Deposit(amount)
            end
        endmethod

        ;; Method to withdraw money from an account
        public method Withdraw, boolean
            accountId, int
            amount, d.
        record
            account, @Account
        proc
            account = FindAccount(accountId)
            if (account != ^null) then
                mreturn account.Withdraw(amount)
            else
                mreturn false
        endmethod

        ;; Method to get account balance
        public method GetBalance, d28.2
            accountId, int
        record
            account, @Account
        proc
            account = FindAccount(accountId)
            if (account != ^null) then
            begin
                mreturn account.Balance
            end
            else
                mreturn 0.0
        endmethod

        public property TotalAssets, d28.2
            method get
            record
                total, d28.2
            proc
                total = 0.0
                foreach data anAccount in accounts as @Account
                begin
                    total += anAccount.Balance
                end
                mreturn total
            endmethod
        endproperty

    endclass
endnamespace
The CreateAccount method in your Bank class is a factory method. CreateAccount encapsulates the logic for creating new Account instances. This ensures that all accounts are created consistently and allows for centralized control over the account creation process. By providing a method to create new accounts for a bank and insert them into the accounts list, you simplify the client code. Users of your Bank class don’t need to know the details of how accounts are managed; they just call CreateAccount.
In the future, if the process of account creation becomes more complex (e.g., adding validation or additional steps in account setup), these changes can be made in one place without affecting the client code. This makes your code more maintainable and scalable.
Let’s put it all together in a simple console application. Go ahead and open BankApp\Program.dbl and add the following code:
import System
import BankApp

main
record
    bank, @Bank
proc
    bank = new Bank(1000)
    bank.CreateAccount("fred")
    bank.Deposit(1000, 500.0)
    Console.WriteLine("Balance: " + %string(bank.GetBalance(1000)))
    bank.Withdraw(1000, 200.0)
    Console.WriteLine("Balance after withdrawal: " + %string(bank.GetBalance(1000)))
endmain
Now we can run our application and see the results:
dotnet run
Balance: 500.00
Balance after withdrawal: 300.00
Okay, the building blocks for accounts are in place. In the next section, we’re going to look at how to handle transfers between accounts.
Transfers
Rather than implement the transfer logic in the Account class, we’re going to create a Transaction abstract class that will be the base class for Transfer and any future transaction types we might want to add. Let’s take a look at a diagram of our system with the Transaction bits added:
For our very simple case, this is clearly more work than we need to do, but let’s talk about why this sort of design decision matters in larger applications.
Single responsibility principle (SRP)
The single responsibility principle states that a class should have only one reason to change. This principle advocates for separating different functionalities into distinct classes. We don’t have to mindlessly follow this principle but it’s a good idea to keep it in mind when designing your classes. Let’s take a look at some of the benefits of following this principle:
- The Account class focuses solely on individual account properties and basic operations like deposit and withdraw.
- The Bank class manages collections of accounts and higher-level operations involving multiple accounts.
- The Transaction class (and its subclasses) deals exclusively with transaction-specific logic and behaviors.
A modular system promotes reusability of components. By encapsulating the transaction logic within its own class hierarchy, these components can be reused in different contexts without duplication of code.
- You can easily extend the system to include different types of transactions (like direct debits, standing orders, or interest applications) without altering the existing Account or Bank classes.
- This modular design makes maintenance and extension of the system more manageable.
Separating concerns enhances maintainability and readability. Each class and method does one thing and does it well, making the code easier to understand and maintain.
- The separation of transaction logic into its own class hierarchy makes the system easier to navigate and understand. Changes in one part of the system are less likely to cause unintended side effects in other parts.
A system designed with separate classes for different functionalities is more flexible and easier to scale.
- Introducing new transaction types or changing existing transaction logic can be done without impacting the core account management functionalities.
- The system can evolve to handle more complex scenarios, like multi-step transactions or conditional transactions, without significant restructuring.
Okay, enough loosely applicable theory—let’s actually code it. To incorporate the concept of inheritance and further demonstrate object-oriented principles in this bank application, we can design a hierarchy of transaction types. In this revised approach, we’ll create a base class named Transaction and then derive specific transaction types like Transfer from this base class.
Transaction class hierarchy
Abstract transaction class
Having one class per file is generally a good idea. Let’s create a new file named Transaction.dbl, add it to BankApp.synproj as we’ve done for the other source files, and add the following code:
namespace BankApp

    ;; Define the base Transaction class
    abstract class Transaction

        protected accountNumber, int
        protected amount, d28.2

        public method Transaction
            accountNumber, int
            amount, d.
        proc
            this.accountNumber = accountNumber
            this.amount = amount
        endmethod

        ;; Abstract method to execute the transaction
        public abstract method Execute, boolean
            bankApp, @Bank
        proc
        endmethod

        ;; Getters for transaction details
        public property AccountNumber, int
            method get
            proc
                mreturn accountNumber
            endmethod
        endproperty

        public property Amount, d28.2
            method get
            proc
                mreturn amount
            endmethod
        endproperty

        public abstract property TransactionType, String
            method get
            endmethod
        endproperty

    endclass
endnamespace
Transfer class
Let’s create a new file named Transfer.dbl, add it to BankApp.synproj, and add the following code:
;; Define the Transfer class extending Transaction
namespace BankApp

    class Transfer extends Transaction

        private toAccountNumber, int

        public method Transfer
            fromAccountNumber, int
            toAccountNumber, int
            amount, d.
        endparams
            parent(fromAccountNumber, amount)
        proc
            this.toAccountNumber = toAccountNumber
        endmethod

        public override method Execute, boolean
            bankApp, @Bank
        proc
            ;; Withdraw from the source account
            if (bankApp.Withdraw(accountNumber, amount))
            begin
                ;; Deposit to the destination account
                bankApp.Deposit(toAccountNumber, amount)
                mreturn true
            end
            mreturn false
        endmethod

        public property ToAccountNumber, int
            method get
            proc
                mreturn toAccountNumber
            endmethod
        endproperty

        public override property TransactionType, String
            method get
            proc
                mreturn "TRANSFER"
            endmethod
        endproperty

    endclass
endnamespace
Now we have our Transfer class that extends the Transaction class. We can now update main to use the new Transfer class:
import System
import BankApp

main
record
    bank, @Bank
proc
    bank = new Bank(1000)
    bank.CreateAccount("fred")
    bank.CreateAccount("fred2")
    bank.Deposit(1000, 500.0)
    Console.WriteLine("Balance: " + %string(bank.GetBalance(1000)))
    bank.Withdraw(1000, 200.0)
    Console.WriteLine("Balance after withdrawal: " + %string(bank.GetBalance(1000)))
    new Transfer(1000, 1001, 100.0).Execute(bank)
    Console.WriteLine("Balance after transfer: " + %string(bank.GetBalance(1000)))
endmain
This should print out the expected output:
Balance: 500.00
Balance after withdrawal: 300.00
Balance after transfer: 200.00
If you get an error about Transfer or Transaction not being found, make sure you’ve added the new files to BankApp.synproj.
Exercise
Try extending the Account class: Add a new field to the Account structure, such as DateOpened, and update the Account class to handle this new field. The companion repository contains a solution to this exercise.
Implement and use a new transaction type: Create a new transaction type, such as InterestApplication, which inherits from the Transaction class. This transaction should apply a fixed interest rate to the account balance.
Once you’ve completed the exercises, it’s time to move on to testing.
If It’s Not Tested, It Doesn’t Work
- The ProcessFile method is called to start the file processing.
- Inside ProcessFile, ValidateFilePath is first called to check the validity of the file path.
- Next, DoSomeStuff is called to perform some operations on the file.
- Inside DoSomeStuff, ReadFile is called to read the file content, but let’s assume the file doesn’t exist, so a NoFileFoundException is thrown.
At this point, stack unwinding begins due to the unhandled NoFileFoundException:
- The runtime starts the unwinding process, looking for a CATCH block that can handle a NoFileFoundException.
- It first checks the current method, ReadFile, but doesn’t find a suitable exception handler.
- The runtime then leaves the ReadFile method and moves to the previous method in the call stack, which is DoSomeStuff.
- In DoSomeStuff, it finds no CATCH block to handle the NoFileFoundException, but it does find a FINALLY block. The FINALLY block is executed, and the runtime continues unwinding in ProcessFile.
- In ProcessFile, it finds a CATCH block that handles the base Exception type (which can catch any exception). The NoFileFoundException is successfully caught here, and the error message is printed to the console.
- The FINALLY block, if it existed within the TRY-CATCH block of ProcessFile, would execute after the CATCH block, regardless of whether an exception was caught.
During this unwinding process, the stack is effectively being “unwound” from the point of the exception (inside ReadFile) back up through the nested method calls (DoSomeStuff and ProcessFile) until it finds an appropriate handler. If more method calls were nested, the runtime would continue unwinding through them in the same manner.
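To make that walkthrough easier to follow, here is a minimal sketch of the kind of call chain being described. The method names come from the walkthrough itself, but the bodies, the validation logic, and the NoFileFoundException type are assumptions made purely for illustration (a class along those lines is sketched under “Defining custom exceptions” below).
namespace Example

    class FileProcessor

        public method ProcessFile, void
            filePath, string
        proc
            try
            begin
                ValidateFilePath(filePath)
                DoSomeStuff(filePath)
            end
            catch (ex, @Exception)
            begin
                ;; The base Exception handler is what finally catches NoFileFoundException
                Console.WriteLine("Error processing file: " + ex.Message)
            end
            endtry
        endmethod

        private method ValidateFilePath, void
            filePath, string
        proc
            ;; Placeholder validation
            if (filePath == "")
                throw new Exception("Empty file path")
        endmethod

        private method DoSomeStuff, void
            filePath, string
        proc
            try
            begin
                ReadFile(filePath)
            end
            finally
            begin
                ;; Runs while the exception unwinds through this method
                Console.WriteLine("Cleaning up in DoSomeStuff")
            end
            endtry
        endmethod

        private method ReadFile, void
            filePath, string
        proc
            ;; Assume the file doesn't exist, so the custom exception is thrown here
            throw new NoFileFoundException("File not found: " + filePath)
        endmethod

    endclass
endnamespace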
Note about local calls
There’s really no interaction with local calls during stack unwinding. The compiler will not allow you to call a local procedure within a TRY-CATCH block. The error DBL-E-LBLSCOPE tells you that the label is out of scope. While it is technically correct, it could be more informative.
Exceptions in legacy code
An error list is more efficient and can be easier to fully understand when working with Synergy I/O statements. ONERROR is nasty business, and you should avoid or rework it whenever it’s possible to do so safely. The good news is that anything ONERROR or an error list can express can also be expressed with exceptions, so legacy handlers can be converted incrementally; the sketch below shows the general shape of such a conversion.
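Here is a hedged before-and-after sketch of that kind of conversion. The routine, file name, and messages are invented; the point is only the shape: an ONERROR trap and label becomes a TRY-CATCH around the same statements, relying on the fact that an untrapped I/O error surfaces as a catchable exception.
;; Before: ONERROR-style handling (sketch)
subroutine update_customer_old
    a_custid, a
record
    chan, i4
proc
    onerror bad_io
    open(chan = 0, u:i, "customer.ism")
    ;; ... read, modify, and write the record ...
    close chan
    offerror
    xreturn
bad_io,
    offerror
    Console.WriteLine("I/O failure while updating customer " + a_custid)
    xreturn
endsubroutine

;; After: the same flow expressed with TRY-CATCH (sketch)
subroutine update_customer_new
    a_custid, a
record
    chan, i4
proc
    try
    begin
        open(chan = 0, u:i, "customer.ism")
        ;; ... read, modify, and write the record ...
        close chan
    end
    catch (ex, @Exception)
    begin
        Console.WriteLine("I/O failure while updating customer " + a_custid + ": " + ex.Message)
    end
    endtry
    xreturn
endsubroutine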
Defining custom exceptions
Custom exception classes let you signal application-specific failure conditions with a distinct type that callers can catch selectively, rather than overloading generic runtime errors or relying on sentinel return values. They also give you a natural place to carry extra context, such as an account number or a file name, alongside the error message.
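The sketch below shows the general shape such a class might take. It assumes the common pattern of extending the base Exception class and passing the message through to the parent constructor; the class name and the extra AccountId value it carries are invented for the example.
namespace BankApp

    ;; Hypothetical exception carrying the account involved in the failure
    class InsufficientFundsException extends Exception

        private accountId, int

        public method InsufficientFundsException
            accountId, int
            message, string
            parent(message)
        proc
            this.accountId = accountId
        endmethod

        public property AccountId, int
            method get
            proc
                mreturn accountId
            endmethod
        endproperty

    endclass
endnamespace
Code that detects the condition can then throw new InsufficientFundsException(accountId, "Balance too low for withdrawal"), and callers can catch that specific type while letting unrelated exceptions continue to propagate.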
ONERROR
ONERROR is an error handling mechanism that allows a program to specify how to proceed when a runtime error occurs. When an ONERROR statement is active, if the program encounters an error, instead of halting execution, it redirects flow to a predefined label, effectively a section of code designated to handle errors. This mechanism is a form of exception handling that was available prior to the adoption of TRY-CATCH blocks. It allows the programmer to maintain control over the program’s behavior in the face of errors, typically by logging the error, cleaning up resources, or attempting to recover gracefully. However, it is considered less sophisticated than modern exception handling techniques because it can lead to complex error control flows and difficulty in maintaining code, especially in larger systems.
Syntax
ONERROR label
Or, for more specific error handling:
ONERROR(error_list) label[, label2, ...][, catch_all]
For clearing error traps:
OFFERROR
Components:
- label: This is a statement label that acts as a go-to target when an error occurs. The program execution jumps to the code below this label.
- error_list: A comma-separated list of error codes. If one of these errors occurs, the execution will jump to the associated label. These must be numerical values that the compiler can resolve at compile time.
- catch_all: An optional label that serves as a catch-all error handler if an error occurs that is not included in the specified error_list.
Behavior:
- The unqualified form (ONERROR label) sets a trap for any runtime error within the current subroutine/function/method, redirecting execution to the specified label regardless of the error type. This has historically been described as the “global” error trap, but it is not truly global; it is only scoped to the current subroutine/function/method.
- The qualified form (ONERROR(error_list) label) allows for finer control, specifying particular errors to catch and where to jump in those cases.
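As a quick illustration of both forms, here is a hedged sketch; the file name and labels are invented, and it assumes the $ERR_EOF literal can be used where the error list expects a compile-time error number.
subroutine report_demo
record
    chan, i4
    buf,  a200
proc
    ;; Unqualified form: any runtime error in this routine jumps to trouble
    onerror trouble
    open(chan = 0, i:i, "report.ism")

    ;; Qualified form: replaces the previous trap; end of file goes to eof_hit,
    ;; anything else falls through to the catch-all label
    onerror ($ERR_EOF) eof_hit, trouble
    repeat
        reads(chan, buf)

eof_hit,
    offerror
    close chan
    xreturn

trouble,
    offerror
    Console.WriteLine("Unexpected I/O error in report_demo")
    xreturn
endsubroutine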
Notes:
- When an ONERROR statement is set, it overrides any previous ONERROR traps.
- ONERROR traps remain active until one of the following occurs:
  - An OFFERROR statement is executed to clear error traps.
  - A new ONERROR statement is executed, replacing the current traps.
  - The execution of an external subroutine or function suspends the traps, which are reinstated upon return (in traditional DBL environments).
  - The program ends.
- If an error occurs that is not specified in any ONERROR statement or error list and no TRY-CATCH block is present to handle it, the program will generate a fatal traceback.
- ONERROR provides a way to implement error handling in environments that do not support structured exception handling, but it must be used with caution to maintain clear and maintainable error-handling logic.
Difficulties with ONERROR
ONERROR can complicate code maintenance for several reasons, especially in codebases that have large subroutines containing many local routines. Here are some of the common issues:
- Complex control flow: ONERROR redirects the program’s execution to a specified label upon encountering an error. This can make it difficult to trace the flow of execution, because it’s not always clear where an error will transfer control, especially if there are multiple ONERROR statements and you aren’t sure which one was last executed.
- Error context loss: When an error occurs and ONERROR triggers a jump to a label, the context in which the error happened may be lost unless explicitly preserved. This loss of context makes debugging harder, as the original state of the program at the time of the error (such as variable values and the call stack) may not be readily available.
- Scattered error handling: ONERROR tends to encourage scattered error handling logic, because each error case may jump to a different part of the code. This scattered approach makes understanding and revising error handling more cumbersome since the logic is not centralized.
- Inadvertent error masking: If the error handling code does not address the error adequately or fails to re-raise the error for upper levels to handle, it can inadvertently mask errors. This may lead to scenarios where errors go unnoticed or are incorrectly logged, making debugging and maintenance challenging.
- Lack of structured cleanup: Modern exception handling with TRY-CATCH often comes with a FINALLY block, which is guaranteed to run for resource cleanup. ONERROR lacks this structured approach to cleanup, leading to potential resource leaks if the error handling code does not manually clean up every resource that may have been in use at the time of the error.
- Maintenance overhead: New developers or even experienced developers not familiar with the original design might find it hard to add new features or fix bugs without introducing new bugs because understanding the existing ONERROR implementations requires a significant amount of time and effort.
- Difficulty in refactoring: Because of the interwoven nature of error handling with ONERROR, refactoring—such as breaking down large subroutines into smaller, more manageable pieces—becomes much harder. There’s a risk of altering the error handling behavior inadvertently, which can introduce new bugs into the system.
Refactoring ONERROR
If you’re thinking about refactoring to remove ONERROR, here are a few strategies to consider:
- Assess the error handling scope: Review the existing ONERROR implementation to determine the scope of errors being handled. Identify the types of errors, especially those related to I/O operations, since they are candidates for I/O error lists.
- Implement I/O error lists: For file I/O operations, replace ONERROR with specific I/O error lists. This means that within each I/O statement (like READ, WRITE, OPEN, etc.), define an error list to handle anticipated I/O errors directly. This approach has the benefit of localized error handling, which makes the code easier to understand and debug.
- Use structured exception handling: For errors beyond I/O, or where a more sophisticated error recovery is required, replace ONERROR with TRY-CATCH blocks. This more modern approach to error handling provides a clear structure for handling exceptions and can make the code more robust and maintainable.
- Selective refactoring: Decide when to use I/O error lists or exceptions based on the context. For example, if the error handling is complex or needs to account for many different kinds of errors, exceptions might be more appropriate. Conversely, for simple, I/O-specific errors, I/O error lists are simpler and more efficient.
- Consistency and conventions: Apply consistent error handling conventions across the codebase. This might involve setting up a standard way of dealing with certain types of errors or creating utility routines for common error handling patterns.
- Test error conditions: After refactoring, rigorously test all error conditions to ensure that the new error handling works as expected. Automated tests can be particularly useful here to simulate various error conditions.
- Documentation and comments: Update code comments and documentation to reflect the new error handling mechanisms. This is crucial for maintaining the code in the future and for aiding other developers in understanding the error handling approach.
- Review performance implications: Evaluate the performance implications of using exceptions versus I/O error lists. Exceptions can be more costly in terms of performance, so for critical sections of code where performance is key and error conditions are well-understood and limited, I/O error lists might be preferred.
- Educate the team: If the refactoring is part of a team effort, ensure all team members understand the changes and the reasons behind them. Training sessions or code reviews can be helpful to disseminate the knowledge.
- Deprecate ONERROR gradually: In a large codebase, it may not be practical to eliminate all ONERROR usages at once. Instead, deprecate it gradually, starting with the most critical or easiest-to-refactor parts of the application.
Error List
In a DBL program, an I/O error list is associated with a file I/O statement. It details how the program should handle specific I/O errors by redirecting program control to error-handling labels. Here’s the general syntax structure:
IO_statement(arguments)[error_list]
Where:
- IO_statement: This is the file I/O statement that the error list applies to, such as READ, WRITE, OPEN, FIND, etc.
- error_list: This is a comma-separated list of error handlers in the form error_code=label. You can also specify a catch-all error handler if you just put a label at the end of the list.
Components of the error list
- error_code: This is a specific error condition you want to handle, such as $ERR_EOF, $ERR_LOCKED, $ERR_NODUPS, etc. It represents a class of error that the I/O operation might encounter.
- label: This is a user-defined label in the program that execution will jump to if the corresponding error occurs.
Example of an I/O error list:
READ(channel, recordArea, key) [$ERR_EOF=notfound_label, $ERR_LOCKED=retry_label, catch_all_label]
In the above example,
- READ: The DBL I/O statement that attempts to read a record.
- recordArea: The variable into which the read data will be placed.
- key: The key value to use for the read operation.
- channel: The channel from which the data is being read.
- $ERR_EOF=notfound_label: If an end-of-file condition is encountered, control transfers to notfound_label.
- $ERR_LOCKED=retry_label: If a record lock error occurs, control transfers to retry_label.
- catch_all_label: If any other error occurs, control transfers to catch_all_label.
Extended example with multiple I/O statements:
main
record myData
    fld1, d10
    fld2, a100
endrecord
record
    chn, int
proc
    open(chn = 0, I:I, "myfile")[$ERR_FNF=file_not_found]
    repeat
    begin
retry_label,
        READS(chn, myData)[$ERR_EOF=end_label, $ERR_LOCKED=retry_label, catch_all]
        nextloop
end_label,
        Console.WriteLine("Reached the end of the file")
        exitloop
catch_all,
        Console.WriteLine("Something went wrong; the error code was: " + %string(syserr))
        nextloop
    end
    stop
file_not_found,
    Console.WriteLine("The file wasn't found")
endmain
Persisting Your Data: ISAM
In this chapter, we delve into the input/output (I/O) operations available and commonly used by DBL programmers. I/O operations form the backbone of any application, providing essential capabilities to interact with various data sources and destinations. Our journey will traverse the spectrum of I/O functionalities in DBL, starting from the basics of simple terminal I/O, advancing through the intricacies of handling raw and ISAM files, and culminating in networked file access and semi-relational data manipulation using the powerful Select class. By exploring these built-in routines and classes, you will gain a robust understanding of how DBL stores, receives, and accesses data, whether it be reading user input from the console or managing large volumes of data in ISAM files. This chapter is designed to not only impart technical knowledge but also to equip you with practical skills for transforming your codebase to best take advantage of modern hardware.
A core concept essential for understanding input/output (I/O) operations is the mechanism of channels. These channels are at the heart of how DBL manages various types of I/O operations, whether they are related to files or terminal interactions. To kick off the topic of I/O, let’s delve into a few key aspects of channels.
Numeric representation of I/O handles
- Range: In traditional DBL platforms, I/O channels are numerically designated and range from 1 to 1024. In DBL implementations on the .NET framework, this range extends up to 8096.
- Function: Each channel number uniquely identifies an I/O stream. This could be a file, a terminal session, a network port, or another input/output device.
The OPEN statement
The OPEN statement in DBL is used to initiate an I/O operation. It associates a specific channel number with a file or device, setting up the parameters for subsequent I/O interactions. The syntax for the commonly used parts of the OPEN statement is as follows:
OPEN(channel, mode_spec, file_spec) [error_list]
Logicals
TODO: Write something about the usage of logicals in the OPEN statement. DAT: bla.ism
Case sensitivity and extensions
TODO: Write something about default extensions and case sensitivity on Linux platforms
Dynamic channel allocation
- Automatic assignment: When opening a file or terminal, if the channel variable is set to zero, the DBL runtime automatically assigns the next available channel number. This channel number is then updated in the variable. This is the only thread-safe way to allocate channels. Using the %SYN_FREECHN routine is strongly discouraged due to the very subtle, difficult-to-track-down, catastrophic bugs that will be introduced.
- Search order: The runtime searches for a free channel starting from the highest number (1024 or 8096) and working downward.
- Data type requirements: The variable intended to hold the channel number must be capable of storing a four-digit number to avoid a BIGNUM error. Older codebases may use d fields for this purpose, but I recommend using i4 fields instead.
Application in terminal and file I/O
- Terminal I/O: For terminal operations, channels link to terminal sessions, enabling data to be read from or written to the terminal. You can specify that you’re opening a terminal channel by passing “tt:” as the file specification.
- File I/O: In file operations, channels are used to perform all of the basic file operations. A given file can be opened multiple times, each with its own independent channel number.
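Here is a short, hedged sketch of both kinds of OPEN. The file name is invented, and the modes shown (O for terminal output, I:I for an ISAM file opened for input) are simply common choices.
main
record
    ttChan,   i4
    fileChan, i4
proc
    ;; Passing 0 lets the runtime assign the next free channel number
    open(ttChan = 0, o, "tt:")                  ;; terminal output channel
    open(fileChan = 0, i:i, "customer.ism")     ;; ISAM file opened for input

    writes(ttChan, "Both channels are open")

    close fileChan
    close ttChan
endmain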
Channel context
Particularly with file channels, there is state that follows a channel (for example, the current position in the opened file along with any record locks that have been taken). We’ll get into much more detail later in this chapter.
ISAM
Indexed Sequential Access Method (ISAM) files represent a cornerstone in database file organization, offering a blend of sequential and direct access methods. Essentially, ISAM files are structured to facilitate both efficient record retrieval and ordered data processing. They consist of two primary components: the data records themselves and a set of indexes that provide fast access paths to these records. This makes ISAM files particularly suitable for customer databases, inventory systems, and any context where data needs to be quickly accessed using keys yet also iterated in a specific order. The power of ISAM lies in its ability to handle large volumes of data with quick access times. Synergy ISAM files support multiple keys on a single file, which allows for fast lookups even when you have multiple access patterns. As we delve deeper into the workings of ISAM files in DBL, it’s helpful to think of your application as running inside the database, rather than just querying it from a distance. I’ll try to point out the places where this matters most as we work through this chapter.
CRUD
CRUD, an acronym for create, read, update, and delete, represents the essential set of operations in database management and application development, forming the cornerstone of most data-driven systems by enabling the basic interaction with stored information. ISAM files in DBL support all four of these operations, and we’ll cover these basics and a few extras in this section.
Common concepts
- Channel: All of these operations require an open channel to the ISAM file. This channel is used to identify the file and track its state (for example, the current position in the file and any locks that have been taken).
- Data area: If you’re reading, creating, or updating a record, you’ll need a data area to hold the record. This is a variable that is defined as a structure or alpha that matches the size of the records in the file.
- GETRFA: Returns the record’s global record file address (GRFA). If you pass an argument with space for 6 bytes, this will be filled with the RFA. If you pass an argument with space for 10 bytes, it gains a 4-byte hash that can be used to implement optimistic locking. This feature will be covered in more detail later in this chapter. TODO: mention the stability that comes from static rfas and also the loss of them if you rebuild a file
- error_list: This argument allows you to specify an I/O error list for robust error management, which ensures the program can handle exceptions like locked records or EOF conditions gracefully.
Create
STORE(channel, data_area[, GETRFA:new_rfa][, LOCK:lock_spec]) [[error_list]]
The STORE statement adds new records to an ISAM file. The statement requires an open channel in update mode (U:I) and a data area that contains the record to be added.
Optional qualifiers:
- GETRFA: This optional argument is used to return the record file address (RFA) of the newly added record, which can be essential for subsequent operations like updates or deletions.
- LOCK: While automatic locking is not supported for STORE, this optional qualifier can be used to apply a manual lock to the newly inserted record.
When executing a STORE, DBL checks for key constraints. For instance, if the ISAM file is set to disallow duplicate keys, attempting to store a record with a duplicate key will trigger a “Duplicate key specified” error ($ERR_NODUPS). It’s important to note that if the data area exceeds the maximum record size defined for the ISAM file, an “Invalid record size” error ($ERR_IRCSIZ) is raised, and the operation is halted.
Transactional capability:
The combination of the LOCK qualifier and the GETRFA argument in a STORE statement lends a degree of transactional capability to the operation. The lock ensures that the record remains unaltered during subsequent operations, while the RFA provides a reference to the specific record. This is not commonly seen in DBL applications, but it’s worth noting in case you find yourself in a situation where you need to implement some form of transactional capability.
Basic example:
These examples make use of the ISAMC subroutine to create a new ISAM file with a single key. ISAMC will be explained in more detail later; for now, we’ll use it without much explanation.
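The following is a minimal, hedged sketch rather than a canonical listing: the file name, record layout, and the exact ISAMC key-specification string are assumptions made for illustration.
main
record custRec
    custId, d6
    name,   a30
endrecord
record
    chan, i4
proc
    ;; Create an ISAM file with one key: starts at position 1, 6 characters long, no duplicates
    xcall isamc("customer", ^size(custRec), 1, "start=1, length=6, nodups")

    ;; Open the file for update and store a new record
    open(chan = 0, u:i, "customer")
    custRec.custId = 1
    custRec.name = "Fred"
    store(chan, custRec)
    close chan
endmain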
Read
READ(channel, data_area, key_spec[, KEYNUM:key_num][, LOCK:lock_spec]
[, MATCH:match_spec][, POSITION:position_spec][, RFA:rfa_spec]
[, WAIT:wait_spec]) [[error_list]]
The READ statement is used to read a specified record from a file that’s open on a given channel. It’s vital for accessing data in ISAM files, where the channel must be opened in input, output, append, or update mode.
- Key arguments:
  - channel: Specifies the channel on which the file is open.
  - data_area: A variable that receives the information from the read operation.
  - key_spec: Defines how the record is located, either by its position, key value, or qualifiers like ^FIRST or ^LAST.
Additional qualifiers:
- KEYNUM: Optional. Specifies a key of reference for the read operation.
- LOCK: Optional. Determines the locking behavior for the record being read.
- MATCH: Optional. Defines how the specified key is matched, which is crucial for precise data retrieval.
- POSITION: Optional. Allows you to position to the first or last record in the file rather than doing a key lookup.
- RFA: Optional. The RFA argument in the READ statement specifies the exact address of a record to be read directly from the file, facilitating targeted data retrieval. The size of the RFA argument changes its behavior: a 6-byte RFA represents just the locator, pinpointing the record’s location, while a 10-byte RFA includes both the locator and a hash value. The hash serves as a data integrity check; if the record at the specified address doesn’t match the hash, indicating that the data has been altered or is no longer valid, a “Record not same” error ($ERR_RECNOT) is triggered.
- WAIT: Optional. Specifies how long the program should wait for a record to become available if it’s locked. This is more efficient than calling READ in a loop with a SLEEP statement because the call will be blocked until the record is available rather than waiting the entire sleep duration and then checking again.
Functional insights:
- The LOCK qualifier specifies whether a manual lock is applied to the record. This feature is critical in concurrent environments where data integrity is paramount.
- The MATCH qualifier defines how a record is located. This determines the behavior when an exact match isn’t found. For historical reasons, the default behavior is Q_GEQ, but that is likely to lead to an unpleasant surprise for someone writing new code. The following options are available:
  - Q_GEQ (0): Finds a record with a key value greater than or equal to the specified key. If an exact match isn’t found, it reads the next higher record, potentially raising a “Key not same” error ($ERR_KEYNOT) if no exact match is found or an “End of file” error if no higher record exists.
  - Q_EQ (1): Searches for a record that exactly matches the specified key value. If there’s no exact match, it triggers a “Record not found” error ($ERR_RNF).
  - Q_GTR (2): Looks for a record with a key value greater than the specified key. It only raises an “End of file” error if it reaches the end of the file without finding a matching record.
  - Q_RFA (3): Locates a record using the record file address specified in the RFA qualifier.
  - Q_GEN (4): Finds a record whose key value is equal to or the next in sequence after the specified key. An “End of file” error occurs if no such records are found.
  - Q_SEQ (5): Retrieves the next sequential record after the current one, regardless of the key specification. It raises an “End of file” error only if it reaches the end of the file.
Practical application: When you lock a record after a READ operation in a database, you are essentially ensuring that the data remains unchanged until your next action, be it a DELETE or a WRITE. This lock is how you maintain data consistency, because it prevents other users or processes from modifying the record during this interim period. In environments where multiple transactions occur simultaneously, locking is key to preventing conflicts. Without locking, there’s a risk that another transaction might alter the data after you’ve read it but before you’ve had the chance to update or delete it, leading to inconsistencies. By locking the record, you ensure that any subsequent WRITE or DELETE reflects the state of the data as it was when you first read it, thereby preserving the accuracy and integrity of your database operations. It’s very important that applications not keep records locked for indefinite periods of time. This can lead to deadlocks, performance issues, and frustrated users.
On the other side of locking, the WAIT argument exists to help you gracefully handle locked records in multi-user environments or scenarios involving concurrent data access. This argument specifies the duration for which the READ operation should wait for a locked record to become available before timing out.
In terms of error control, appropriate use of the WAIT argument can prevent your application from failing immediately when encountering locked records. By setting a reasonable wait time, you allow your application to pause and retry the READ operation, thereby handling temporary locks effectively without leading to spurious errors or immediate application failure.
However, developers need to be careful to avoid potential infinite loops or excessively long wait times. This caution is particularly important in high-throughput systems where waiting for extended periods can significantly impact performance. One approach is to implement a retry logic with a maximum number of attempts or a cumulative maximum wait time. After these thresholds are reached, the application can either log an error, alert the user, or take alternative action. This strategy ensures that the application retries sufficiently to handle temporary locks but not indefinitely, thus maintaining a balance between robust error handling and application responsiveness.
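Pulling the MATCH, WAIT, and error-list pieces together, here is a hedged sketch of a keyed read with a bounded retry. The file, record layout, key value, and the three-attempt limit are all invented for the example.
main
record custRec
    custId, d6
    name,   a30
endrecord
record
    chan,     i4
    attempts, i4
proc
    open(chan = 0, u:i, "customer")
    attempts = 0
retry,
    attempts += 1
    ;; Exact-match keyed read; wait up to 5 seconds for any lock to clear
    read(chan, custRec, "000001", MATCH:Q_EQ, WAIT:5) [$ERR_RNF=not_found, $ERR_LOCKED=locked]
    Console.WriteLine("Found " + %atrim(custRec.name))
    close chan
    stop
locked,
    if (attempts < 3)
        goto retry
    Console.WriteLine("Record stayed locked; giving up")
    close chan
    stop
not_found,
    Console.WriteLine("No matching record")
    close chan
endmain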
Update
WRITE(channel, data_area[, GETRFA:new_rfa]) [[error_list]]
The WRITE statement in DBL is used to update a record in a file. When you execute a WRITE operation on a channel that is open in output, append, or update mode, the specified record is updated with the contents of the data area.
GETRFA is optional. It returns the RFA after the WRITE operation, which is useful in files with data compression, for variable-length records, or when trying to get the new hash for the updated record.
Constraints: The record being modified must be the one most recently retrieved and locked. Modifying unmodifiable keys or primary keys directly with WRITE is not allowed and will result in errors. If the data area exceeds the maximum record size, an “Invalid record size” error ($ERR_IRCSIZ) occurs.
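A typical update therefore follows a read-lock-modify-write sequence. The sketch below is hedged: the file and field values are invented, and it assumes the channel was opened in update mode so the READ leaves the retrieved record locked.
main
record custRec
    custId, d6
    name,   a30
endrecord
record
    chan, i4
proc
    open(chan = 0, u:i, "customer")

    ;; READ on an update-mode channel locks the record it retrieves
    read(chan, custRec, "000001", MATCH:Q_EQ) [$ERR_RNF=not_found]

    ;; Modify the data area and write the locked record back
    custRec.name = "Fred Updated"
    write(chan, custRec)

    close chan
    stop
not_found,
    Console.WriteLine("No matching record to update")
    close chan
endmain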
Immutable keys
Individual keys in an ISAM file can be marked as immutable, meaning they cannot be updated. This constraint is enforced when calling WRITE on a record with an immutable key. Immutable keys are a choice in database design that can play a role in maintaining data integrity, ensuring consistency, and simplifying database management.
Immutable as identifiers: One primary reason to enforce immutable keys is to preserve the integrity of unique identifiers. In many database designs, certain key fields serve as definitive identifiers for records, akin to a Social Security number for an individual. Allowing these key values to change can lead to complications in tracking and referencing records, potentially causing data integrity issues. For example, if a key field used in foreign key relationships were allowed to change, it could create orphaned records or referential integrity problems.
Avoiding designs with complex updates: Allowing keys to change can lead to complex update scenarios where multiple related records need to be updated simultaneously. This can increase the complexity of CRUD operations, making the system more prone to errors and harder to maintain. By enforcing immutability, developers can simplify update logic, as they don’t have to account for cascading changes across related records or indexes.
Delete
Soft delete vs hard delete
Delete is a pretty fundamental operation in DBL applications, but your codebase has probably already decided how it’s going to handle deletes. We’re going to cover both soft and hard approaches here to help you better understand the tradeoffs that your application’s designers originally made.
Soft deletes
Soft delete is a form of delete where a record is not physically removed from the database; instead, it’s marked as inactive or deleted. This is usually implemented using a flag, such as is_deleted, or a timestamp field, like deleted_at, to indicate that the record is no longer active. If you need to delete a record in a soft-delete system, the order of operations is as follows:
- READ the record.
- Update the field in the record to mark it as deleted.
- WRITE the record back to the file.
Advantages:
- Data recovery: Soft deletes allow for easy recovery of data. Mistakenly deleted records can be restored without resorting to backup data, which is especially useful in user-facing applications where accidental deletions might occur.
- Audit trails and historical data: Maintaining historical data is essential in many systems for audit trails. Soft deletes enable you to keep a complete history of all transactions, including deletions, without losing the integrity of historical data.
- Referential integrity: In databases with complex relationships, soft deletes help maintain referential integrity. Removing a record might break links with other data. Soft deletes prevent such issues, ensuring that all relationships are preserved.
Hard deletes
Hard delete is the complete removal of a record from the database. Once a record is hard deleted, it is permanently removed from the database table.
DELETE(channel) [[error_list]]
The DELETE statement is used to remove a record from a file. DELETE requires an open channel in update mode (U:I) positioned to a record that previously has been locked. If you need to delete a record in a hard delete system, the order of operations is as follows:
- READ or FIND the record with automatic locking or manual locking enabled.
- Call DELETE on the same channel as the READ or FIND operation.
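As a hedged sketch of that sequence (the file name and key value are invented):
main
record custRec
    custId, d6
    name,   a30
endrecord
record
    chan, i4
proc
    open(chan = 0, u:i, "customer")

    ;; The READ locks the record so the DELETE acts on the row we just retrieved
    read(chan, custRec, "000001", MATCH:Q_EQ) [$ERR_RNF=not_found]
    delete(chan)

    close chan
    stop
not_found,
    Console.WriteLine("Nothing to delete")
    close chan
endmain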
Considerations:
- Space efficiency: Hard deletes free up space in the database, making them suitable for applications where data storage is a concern.
- Simplicity: Hard deletes can be simpler to implement and manage, as they involve straightforward removal without the need for additional mechanisms to handle the deleted state.
- Privacy compliance: For certain data, especially personal or sensitive information, hard deletes might be necessary to comply with privacy regulations like GDPR, where individuals have the “right to be forgotten.”
Why opt for soft deletes
A system might be designed to use soft deletes for several reasons:
- Undo capability: Soft deletes provide an “undo” option, which is crucial in many business applications for correcting accidental deletions without resorting to more complex data restoration processes.
- Data analysis and reporting: Having historical data, including records marked as deleted, can be valuable for trend analysis and reporting. Soft deletes allow you to maintain and query this historical data.
- System integrity: In systems with intricate data relationships, soft deletes help maintain the integrity of the system by ensuring that deleting one record does not inadvertently impact related data.
Beyond CRUD
Sequential access
Sequential access in databases, exemplified by the READS statement in DBL, is a method where records are processed in an order determined by a key. For certain tasks, this approach is particularly advantageous, versus reading individual records one at a time, for several reasons:
- Efficiency in processing ordered data: When data needs to be processed in the order it’s stored (such as chronological logs or sorted lists), sequential access allows for a streamlined and efficient traversal. It eliminates the overhead associated with locating each record individually based on specific criteria or keys.
- Optimized for large-volume data handling: Sequential access is ideal for operations involving large datasets where each record needs to be examined or processed. Accessing records in sequence reduces the computational cost and complexity compared to individually querying records, especially in scenarios like data analysis, report generation, or batch processing.
- Simplicity and reduced complexity: Implementing sequential access in applications simplifies the code, as it follows the natural order of records in the file. This contrasts with the more complex logic required for random or individual record access, where each read operation might require a separate query or key specification.
READS(channel, data_area[, DIRECTION:dir_spec][, GETRFA:new_rfa][, LOCK:lock_spec]
[, WAIT:wait_spec]) [[error_list]]
Basics of READS:
- The READS statement is designed to retrieve the next sequential record from a file, based on the collating sequence of the index of reference. This means that it reads records in either ascending or descending order, depending on the key’s ordering.
Navigating through records:
- READS is particularly adept at navigating through records in a file, moving logically from one record to the next. It maintains a context of “next record,” established through operations like READ and FIND.
- READS does not perform any filtering or matching; it simply reads the next record in the file based on the order of the key that was used to position it. If you want to stop reading records after the criteria is no longer matching, you’ll need to check manually against the data area or use the Select class.
Directional reading:
- With the DIRECTION qualifier, READS offers the ability to traverse records in reverse order. In the distant past, before this functionality existed, keys had to be declared in reverse order to support this access pattern, so you may still see files where there are two keys that differ only by the declared order.
Error handling and record locking
Handling EOF and errors:
- When READS encounters the end (or beginning, when reverse reading) of the file, it triggers an “End of file” error ($ERR_EOF). This error sets the current record position to undefined, so it’s important to handle it appropriately. This can be done by checking for the $ERR_EOF error and taking appropriate action, such as exiting the loop or performing cleanup operations.
Record locking:
- In update mode, READS automatically locks the retrieved record, ensuring that the data remains consistent during any subsequent update operations. This automatic locking is crucial in scenarios where data integrity is paramount, particularly in multi-user environments.
Best practices and considerations
Consider using Select instead of READS for sequential access, especially if you need to filter the records you’re reading. Select is covered in more detail later in this chapter. TODO: there are more best practices for READS to cover here
Repositioning, checking for existence, or targeted locking
The FIND statement is designed to position the pointer to a specific record in a file, setting the stage for its subsequent retrieval. Unlike the READ statement, which fetches the record’s data immediately, FIND simply locates the record, allowing the next READS statement on the same channel to retrieve it.
FIND(channel[, record][, key_spec][, GETRFA:new_rfa][, KEYNUM:krf_spec]
& [, LOCK:lock_spec][, MATCH:match_spec][, POSITION:pos_spec]
& [, RFA:match_rfa][, WAIT:wait_spec]) [[error_list]]
Key parameters:
- record: Don’t use this argument.
- key_spec: This works the same as READ and is used to specify the key to use for locating the record.
Special qualifiers: These special qualifiers all work the same as READ and are used to control the behavior of the FIND operation.
- KEYNUM
- LOCK
- MATCH
- POSITION
- RFA
- WAIT
Positioning vs retrieving:
- While READ retrieves and locks the record in one operation, FIND only attempts to position the pointer to the record.
Use cases for FIND:
- FIND will set the context for a future READS operation. This has been used in applications to slightly simplify the READS loop to avoid the specialized initial step of dealing with the first record.
- FIND is useful if you need to check for the existence of a record without retrieving it, for example, if you want to check for duplicates or for the existence of a record before creating it.
- FIND is also useful if you need to lock a record without retrieving it, for example, for optimistic locking or for locking a record before overwriting it.
Locking behavior:
- By default, FIND does not lock the located record unless explicitly specified with the LOCK qualifier or enabled through compiler options. READ, on the other hand, automatically locks the retrieved record if used on a channel opened for update.
Keyed lookups
Keyed lookups in ISAM files are a fundamental aspect of database operations in DBL, offering a distinct approach compared to non-ordered keyed lookups, such as those in a hash table. Understanding these differences, as well as the concept of partial keyed reads, is crucial for database developers.
Keyed lookups in ISAM files
Ordered nature of ISAM lookups:
- ISAM files in DBL utilize ordered keys for data retrieval, meaning the records are sorted based on the key values. This ordered structure enables efficient range queries and sequential access based on key order. For example, you can quickly locate a record with a specific key or navigate through records in a sorted manner.
Comparison with hash table lookups:
- Unlike ISAM files, hash tables are typically unordered. A hash table provides fast access to records based on a hash key, but it doesn’t maintain any order among these keys. While hash tables excel in scenarios where you need to quickly access a record by a unique key, they don’t support range queries or ordered traversals as ISAM files do.
Efficiency considerations:
- In scenarios where the order of records is important, ISAM files are more efficient than hash tables. The ability to perform range scans and ordered retrievals makes ISAM files preferable for applications where such operations are frequent.
Partial keyed reads in ISAM
Understanding partial keyed reads:
- Partial keyed reads in ISAM files allow for searches based on a portion of the key, starting from the leftmost part. This means you can perform lookups using only the beginning segment of the key, and the search will return records that match this partial key.
Left-to-right application in keyspace:
- The key structure in ISAM files is significant when it comes to partial keyed reads. The search considers the key’s structure from left to right. For instance, if you have a composite key made of two fields, say, region and department, a partial key search with just the region will return all records within that region, regardless of the department.
Use case scenarios:
- Partial keyed reads are particularly useful in scenarios where data is hierarchically structured. For instance, if you’re dealing with geographical data, you might want to retrieve all records within a certain area without specifying finer details initially. Partial keyed reads allow this level of query flexibility.
In this diagram:
- Each EmployeeRecord is represented with different key orderings: by id, dept, region, and manager.
- The records are sorted based on each key ordering, demonstrating how the positioning of the key fields affects the sorting and access pattern.
- The sorting order shows how queries and partial key reads would be efficient for different key combinations, illustrating the significance of key ordering in database design and querying.
- The sort order of a composite key (manager + id) shows that if you partially key read on the manager field, you’ll get all the records for that manager in order by id.
- Region as defined here must be a non-unique key because there are multiple records with the same region. This is a common pattern in ISAM files where you have both a unique key and a non-unique key that is used for range queries. This is also true for dept but not for id or (manager + id).
- FIND and READ can specify a partial key value in order to position on the first matching record in a sequence. READS will advance to the next record in the file based on the ordering of the key, but it will not filter out records that don’t match the partial key value.
If you wanted to access records by the keys in the diagram above, you would specify the KEYNUM argument to READ or FIND. The mapping of the KEYNUM argument to the key in the file is literally the ordering you specified to ISAMC or the declaration order in your XDL file. It’s a good idea to use a variable to hold this mapping so you can change the order of the keys in the file without having to change all of your code.
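To make that concrete, here is a hedged sketch of a partial keyed read against a composite (manager + id) key like the one in the diagram. The record layout, the key-of-reference number, and the handling of the “Key not same” positioning error are all assumptions made for illustration.
main
record empRec
    id,      d6
    dept,    a4
    region,  a2
    manager, d6
endrecord
record
    chan,      i4
    target,    d6
    targetKey, a6
proc
    open(chan = 0, i:i, "employee")
    target = 123456
    targetKey = target      ;; the manager portion of the (manager + id) key

    ;; Position on the first record whose key begins with this manager value.
    ;; With a partial key there is usually no exact match, so treat the
    ;; "Key not same" positioning error as success.
    find(chan, , targetKey, KEYNUM:4) [$ERR_KEYNOT=positioned, $ERR_EOF=done, $ERR_RNF=done]
positioned,
    repeat
    begin
        reads(chan, empRec) [$ERR_EOF=done]
        ;; READS doesn't filter, so stop once we've moved past this manager's records
        if (empRec.manager != target)
            exitloop
        Console.WriteLine("Employee " + %string(empRec.id))
    end
done,
    close chan
endmain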
Context-driven key design
Analyzing user needs:
- When designing keys, it’s essential to consider the user’s perspective and the tasks they need to accomplish. For instance, if there’s a user role focused on managing inventory in a specific warehouse, it would be beneficial to key inventory records by warehouse ID. This approach allows for rapid access and management of all items within a given warehouse.
Example:
- In an inventory management system, if users frequently access all items in a specific warehouse, designing a composite key like [WarehouseID, ItemID] would optimize data retrieval for these users.
Key design for data relationships
Facilitating efficient joins:
- Often, one of the main reasons for accessing a particular dataset is to join it with another. In such cases, it’s vital to design keys that can easily and accurately match the related records. The key should include fields that are present in the driving record of the join operation.
Example:
- Suppose you have an Orders table and a Customers table. If orders are frequently retrieved based on customer information, having a CustomerID in both tables as part of their keys allows for efficient joins between these tables.
Considerations for composite keys in joins:
- When using composite keys, ensure that the key components align with the fields used in join conditions. The order of elements in a composite key can affect the efficiency of the join operation, especially in ISAM databases where key order dictates access patterns.
Optimistic concurrency
Optimistic concurrency control is a method used in database management where transactions are processed without locking resources initially, but there’s a check at the end of the transaction to ensure no conflicting modifications have been made by other transactions.
How GRFA facilitates optimistic concurrency:
- Record identification: When a record is read, its GRFA can be retrieved. This GRFA acts as a locator to get back to that record at a later time.
- Concurrent modifications: While a transaction is processing a record, other transactions might also access and modify the same record.
- Final validation: At the point of updating or committing the transaction, the current GRFA of the record is compared with the GRFA obtained at the start. If the hash part of the GRFA has changed, it indicates that the record has been modified by another transaction since it was read.
- Conflict resolution: In case of a mismatch in GRFA values, the transaction knows that a conflicting modification has occurred. The system can then take appropriate actions, such as retrying the transaction, aborting it, or triggering a conflict resolution mechanism.
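Here is a hedged sketch of that flow using two channels over the same file: an input-mode channel for the initial, lock-free read and an update-mode channel for the final validated write. The file, key value, and field edits are invented, and the validation is done simply by comparing the two 10-byte GRFA values.
main
record custRec
    custId, d6
    name,   a30
endrecord
record
    readChan,     i4
    updateChan,   i4
    originalGrfa, a10    ;; 6-byte RFA plus 4-byte hash
    currentGrfa,  a10
proc
    open(readChan = 0, i:i, "customer")
    open(updateChan = 0, u:i, "customer")

    ;; Initial read on the input channel: capture the GRFA, hold no lock
    read(readChan, custRec, "000001", MATCH:Q_EQ, GETRFA:originalGrfa) [$ERR_RNF=done]

    ;; ... the user edits the data; other processes may modify the record ...

    ;; Re-read on the update channel (which locks it) and compare GRFAs
    read(updateChan, custRec, "000001", MATCH:Q_EQ, GETRFA:currentGrfa) [$ERR_RNF=done]
    if (originalGrfa != currentGrfa)
        goto conflict

    custRec.name = "Fred (updated)"
    write(updateChan, custRec)
    goto done

conflict,
    Console.WriteLine("Record changed since it was first read; retry the transaction")
done,
    close readChan
    close updateChan
endmain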
Practical implications
Advantages:
- Optimistic concurrency reduces the need for locking, thereby enhancing performance in environments with high concurrency. It allows multiple transactions to proceed in parallel without waiting for locks.
- It provides a mechanism to ensure data integrity without the overhead of locking resources for the duration of the transaction.
Considerations:
- While optimistic concurrency control increases throughput, it requires careful handling of conflict scenarios. Applications must be designed to handle cases where transactions are rolled back or retried due to GRFA mismatches.
- It’s most effective in scenarios where read operations are frequent but actual conflicts are relatively rare.
- Harmony Core uses optimistic concurrency along with transactional rollback capability to ensure data integrity in concurrent environments.
GRFA stability:
- The RFA part will be stable as long as the file is not rebuilt. The hash part will be stable as long as the record is not modified. If the record is modified, the hash will change. If the file is rebuilt, the RFA will change. This means you should not store the GRFA or RFA in a file or database, because it will become invalid if the file is rebuilt. If you need to store a reference to a record, you should store a unique key value and use that to retrieve the record.
SLEEP vs WAIT
Comparing the use of a SLEEP statement in a loop for retrying a read operation with a lock to employing a WAIT qualifier on a READ or READS statement reveals some key differences, particularly in terms of efficiency and resource management. When you use a SLEEP statement in a loop to retry a locked read operation, your approach is essentially a form of polling. In this scenario, the program repeatedly checks if the lock is released, interspersed with sleep intervals. While this method is straightforward, it can be inefficient, as it continuously consumes CPU cycles and can lead to increased response times due to the sleep duration.
On the other hand, utilizing a WAIT qualifier with a READ statement leverages operating system–level blocking I/O with a timeout. This approach is generally more efficient because it allows the operating system to suspend the execution of your thread until the lock is released or the timeout is reached. During this suspension, the CPU resources that would have been consumed by polling are freed up for other tasks. The blocking mechanism is more responsive as well, as the system can resume the thread’s execution immediately after the lock is released, without waiting for the next polling interval. This method not only improves resource utilization but also tends to offer better overall performance, particularly in high-concurrency environments or applications where minimizing the response time is crucial.
Hooks
I/O hooks in DBL provide a powerful mechanism for intercepting and customizing file I/O operations, allowing developers to inject custom logic into the lifecycle of data access without altering existing code. This capability is especially useful in scenarios where you need to add functionality like logging, data validation, transformation, or encryption transparently to the I/O process. By leveraging I/O hooks, developers can gain finer control over how their applications interact with data sources, enhancing flexibility, maintainability, and the potential for sophisticated data handling strategies. This feature is invaluable both in developing new applications and enhancing existing ones, as it enables a seamless integration of custom I/O behaviors.
Using I/O hooks
To demonstrate the power and flexibility of I/O hooks in DBL, consider a scenario where you need to add logging for every read operation on a file. Instead of modifying every instance where a read operation occurs, you can create a custom IOHooks class that overrides the read_pre_operation_hook and read_post_operation_hook methods.
Implementing read operation logging
First, define a class that extends the IOHooks class:
import Synergex.SynergyDE.IOHooks

namespace Example

    class LoggingIOHooks extends IOHooks

        public method LoggingIOHooks
            ch, int
            parent(ch)
        proc
        endmethod

        protected method read_pre_operation_hook, void
            in flags, IOFlags
        proc
            ; Custom logic before a read operation
            Console.WriteLine("Pre-read hook: Starting read operation.")
        endmethod

        protected method read_post_operation_hook, void
            inout buffer, a
            in flags, IOFlags
            inout error, int
        proc
            ; Custom logic after a read operation
            Console.WriteLine("Post-read hook: Read operation completed.")
        endmethod

    endclass
endnamespace
In this class, read_pre_operation_hook is executed before any read operation, and read_post_operation_hook is executed after. Both methods log messages to the console.
To use this custom IOHooks class, create an instance and associate it with a file channel:
main
record
channel, int
record rec
somedata, a100
proc
open(channel=0, i:i, "mydatafile.ism")
new LoggingIOHooks(channel)
; Perform read operations as usual
reads(channel, rec)
; The hooks will automatically log messages before and after the read
close channel
channel = 0
end
When the READS statement is executed, the pre- and post-read hooks are automatically called, adding logging without changing the original read operation.
Global I/O hooks for easier adoption
For an application that wasn’t designed with a single routine for all file opens, consider using global I/O hooks. Inside the global hook, you get enough information to decide if you want to hook the file and potentially determine which hook if you have multiple. To implement global hooks, you need to define them in the SYN_GLOBALHOOKS_OPEN routine:
subroutine SYN_GLOBALHOOKS_OPEN
in channel, n
in modes, OPModes
in filename, a
proc
if(filename == "mydatafile") then
new LoggingIOHooks(channel)
else
nop ; Do nothing, it's not our file
endsubroutine
In this example, the global hook runs after each file is opened, and the logging functionality is applied to every channel opened for the matching file, no matter where in the application that OPEN occurs.
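If you have more than one hook class, the same routine can choose between them based on the file being opened. Here's a rough sketch, assuming a second, hypothetical AuditIOHooks class exists alongside LoggingIOHooks:
subroutine SYN_GLOBALHOOKS_OPEN
    in channel, n
    in modes, OPModes
    in filename, a
proc
    using filename select
    ("mydatafile"),
        new LoggingIOHooks(channel)
    ("auditfile"),
        new AuditIOHooks(channel)   ; hypothetical second hook class
    (),
        nop                         ; not a file we care about
    endusing
endsubroutine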
Pitfalls of I/O hooks
While powerful, I/O hooks in DBL execute synchronously with I/O operations. This synchronous execution means that the custom logic in the hooks runs in the same thread and directly affects the performance and behavior of the I/O operation. Misusing I/O hooks can lead to several detrimental effects on an application, such as performance degradation, data inconsistency, or unintended side-effects.
Performance impact
Consider a scenario where you have implemented a read_post_operation_hook to perform complex data processing or external API calls:
namespace Example
class HeavyProcessingIOHooks inherits IOHooks
public method HeavyProcessingIOHooks
ch, int
parent(ch)
proc
endmethod
protected method read_post_operation_hook, void
inout buffer, a
in flags, IOFlags
inout error, int
proc
; Complex data processing
; or time-consuming external API call
;PerformComplexDataProcessing(buffer)
endmethod
endclass
endnamespace
Using this I/O hook, every read operation on the file waits for the PerformComplexDataProcessing method to complete. This can significantly slow down the overall performance of your application, especially if this file contains a lot of records that need to be read.
Mitigating performance impacts in I/O hooks
The use of I/O hooks, particularly for logging purposes, can have a noticeable impact on application performance. Since I/O hook methods are invoked synchronously with I/O operations, any additional processing time they introduce directly affects the overall I/O performance. To mitigate these impacts, it's crucial to optimize the hook methods and consider asynchronous or lightweight logging strategies.
Use lightweight logging: Opt for minimalistic logging formats and avoid complex string operations. For instance, record only essential details such as operation type, timestamp, and identifiers.
Buffered logging: Instead of writing logs to a file or external system directly within the hook, consider buffering these logs in memory and writing them in batches. If your scenario allows for the potential buffering delay, this reduces the I/O overhead within the hook.
import System.Collections
import Synergex.SynergyDE.IOExtensions
namespace Example
class BufferedLoggingIOHooks inherits IOHooks
private const BUFFER_THRESHOLD, int, 100
protected logBuffer, @ArrayList
; Other hook methods
public method BufferedLoggingIOHooks
ch, int
parent(ch)
proc
; Initialize the in-memory log buffer
logBuffer = new ArrayList()
endmethod
protected method write_post_operation_hook, void
inout buffer, a
in flags, IOFlags
inout error, int
proc
; Append log to buffer
logBuffer.Add(BuildLogEntry(buffer))
; Periodically flush the buffer
if (logBuffer.Count >= BUFFER_THRESHOLD)
FlushLogBuffer()
endmethod
private method BuildLogEntry, @a
buffer, a
proc
;;format the log entry here and return it
mreturn (@a)buffer
endmethod
private method FlushLogBuffer, void
proc
;;write out all of the entries from logBuffer and clear it
endmethod
endclass
endnamespace
Offload to external processes: Consider delegating the heavy lifting or external network operations to a separate, asynchronous process or thread. This way, the main I/O operations aren't blocked by logging activities.
Selective hooking: Be selective about which operations to hook. Not all I/O operations might need logging or additional processing. Consider hooking only the operations that require it.
Implement caching: If your hooks involve data lookup or processing that can be cached, implement a caching mechanism to avoid redundant operations. For example, cache the results of database lookups or complex calculations.
Keep it simple: The logic within each hook method should be as simple and efficient as possible. Avoid unnecessary computations or complex logic within the hooks.
Overview of Select
The Select class in DBL is a powerful and versatile tool for data retrieval, manipulation, and querying. It offers SQL-like capabilities within the DBL environment, allowing developers to perform complex data operations with ease. The Select class can be used in conjunction with the FOREACH statement for iterating over data sets, or individually for more specific data operations.
Key components of Select
- From class: Specifies the data source for the selection. It defines the file or database table to be queried and the structure of the records.
- Where class: Used to set conditions for data selection. It allows filtering of records based on specified criteria, similar to the WHERE clause in SQL.
- OrderBy class: Facilitates sorting of the retrieved data. It can order the data based on one or more fields, either in ascending or descending order.
- GroupBy class: Enables grouping of records based on one or more fields. This is akin to the GROUP BY clause in SQL, providing a way to aggregate data.
- Join functionality: Allows the combination of records from multiple data sources, akin to JOINs in SQL. This includes capabilities for inner joins and left outer joins.
- Sparse class: Offers performance optimization, particularly for networked data access with xfServer. It allows partial record retrieval, reducing data transfer over the network.
An example
For starters, we're going to need some data. Go grab the employee.txt and employee.xdl files from the data folder in the companion repository. employee.txt is a simple text file with some employee data in it, and employee.xdl is an XDL file that tells fconvert about our record and key layout. We're going to use these to create a simple ISAM file with the fconvert utility, a command line utility that comes with Synergy/DE for creating and manipulating ISAM files. To create our file, make sure you're in a dbl command prompt, and then run the following command:
fconvert -i1 employee.txt -oi employee -d employee.xdl
It shouldn't output anything to the screen if it worked correctly, but if you look in your project directory, you should see a new file named employee.ism. This is our ISAM file. Now that we have our data, let's write a program to read it. Create a new file named select.dbl in your project directory and add the following code:
import Synergex.SynergyDE.Select
main
record emp_rec
EmployeeID, a5
Name, a30
Department, a20
Salary, d9.2
endrecord
record
channel, i4 ; Channel for file access
fromObj, @From ; From object for Select
whereObj, @Where ; Where object for Select
selectObj, @Select
endrecord
proc
; Open the employee file
open(channel=0, I:I, "employee.ism")
; Create a From object for the employee file
fromObj = new From(channel, emp_rec)
; Define a Where clause (e.g., selecting employees from a specific department)
whereObj = (Where)emp_rec.Department.eq."Sales"
; Create a Select object with the From and Where objects and
; an OrderBy clause (e.g., ordering by Salary)
selectObj = new Select(fromObj, whereObj, OrderBy.Descending(emp_rec.Salary))
; Iterate over the selected records
foreach emp_rec in selectObj
begin
; Process each record (e.g., display employee details)
Console.WriteLine( "ID: " + emp_rec.EmployeeID +
& ", Name: " + emp_rec.Name +
& ", Department: " + emp_rec.Department +
& ", Salary: " + %string(emp_rec.Salary))
end
; Close the file channel
close channel
end
Pretty easy, right? You've already seen most of this before, so let's just talk about a few of the interesting bits. The Where clause lets you build your filter criteria using regular DBL syntax. Here are a few more valid Where snippets to drive that home.
whereObj = (Where)emp_rec.Department == "Sales"
whereObj = (Where)emp_rec.Department == "Sales" && emp_rec.Salary.gt.100000
whereObj = (Where)emp_rec.Department.eq."Sales".and.
& emp_rec.Salary > 100000 || emp_rec.Salary < 50000
Notice that you can use the && and || operators to build more complex criteria. You can also use the .eq. and .gt. operators instead of == and >. The choice of symbols or words is really a stylistic choice. As a comparison, the query in our program would look something like this in SQL:
SELECT * FROM employee WHERE Department = 'Sales' ORDER BY Salary DESC
From
The From object acts as the primary binding link between the data area and the data source. When a From object is created, it is bound to a specific data area, which is defined by a record, group, or structure within the program. This binding process essentially establishes a mapping between the data structure in the program and the corresponding structure of the data source. You will use this bound data area when defining your Where and OrderBy clauses, as well as when accessing the data retrieved from the data source. It may feel slightly odd to have the binding expressed in the From object and also when iterating over the FOREACH loop, but if you follow the pattern of using the From object to define the data area and then using the data area in the FOREACH loop, you'll be fine. The From object maintains a connection to the original data area throughout its lifecycle, and therefore, the data area you specify must live at least as long as the From object you create.
Changing things in the data
The Select class is not read-only. If your file channel is opened for update, you can use it to update and delete records. Let's revisit our example from above and add a few lines to update the salary of all the employees in the sales department. First change the OPEN statement to the following:
open(channel=0, U:I, "employee.ism")
Now add the following code inside the FOREACH loop, after the Console.WriteLine call:
; Give a 10% raise
emp_rec.Salary = emp_rec.Salary * 1.10
Select.GetEnum().Current = emp_rec
Now if you compile and run the program, you should see the salaries of all the employees in the sales department increase by 10%. If you keep running it, they will just keep going up! You can also use the Select class to delete records. This one is the easiest of all: whatever your Select criteria are, you just call selectObj.Delete() and it deletes all the records that match. Let's write a new program to delete all the employees from the "CORP" department.
import Synergex.SynergyDE.Select
main
record emp_rec
EmployeeID, a5
Name, a30
Department, a20
Salary, d9.2
endrecord
record
channel, i4 ; Channel for file access
fromObj, @From ; From object for Select
whereObj, @Where ; Where object for Select
selectObj, @Select
endrecord
proc
; Open the employee file
open(channel=0, U:I, "employee.ism")
; Create a From object for the employee file
fromObj = new From(channel, emp_rec)
; Define a Where clause (e.g., selecting employees from a specific department)
whereObj = (Where)emp_rec.Department.eq."CORP"
; Create a Select object with the From and Where objects
selectObj = new Select(fromObj, whereObj)
Console.WriteLine("Deleted " + %string(selectObj.Delete()) +
& " employees from the CORP department")
close(channel)
endmain
When you run this program the first time, you should see the following output:
Deleted 12 employees from the CORP department
%DBR-S-STPMSG, STOP
If you run it again, it will delete 0 employees, because there are no more employees in the CORP department. The update and delete operations we’ve just shown would look like this in SQL:
UPDATE employee SET Salary = Salary * 1.10 WHERE Department = 'Sales'
DELETE FROM employee WHERE Department = 'CORP'
There are some important things to keep in mind when using Select over a channel opened for update. First, be aware that each record you read is locked until you move to the next one or finish the loop. If you don't intend to update records in the loop, use the Select class over a channel opened for input only. Next, you need to know how to handle locked records, because someone else could be holding a lock on a record you're trying to read.
Events
To handle locked records, you can implement a class that derives from the Synergex.SynergyDE.Select.Event class. This class has just one member, onLock, which is called when your Select object encounters a locked record. Let's take a look at a basic implementation of this class:
; Define a class that extends the Event class for handling locks
class LockHandler extends Event
; Override the onLock method to handle the lock event
public override method onLock, boolean
inout lock, n
inout wait, n
in rec, a
in rfa, a
proc
; set wait to 1 to retry the locked operation
wait = 1
; Return true to retry the locked operation
; or false to throw an exception
mreturn true
endmethod
endclass
This class is pretty simple, but you can do a lot with those four arguments. You use the lock and wait parameters to tell the Select class what to do when it retries. For example, you can set wait to 1 and return true to retry with a wait, similar to WAIT: 1 on a READS statement. Since you get the contents of the record in rec, you can use that to decide on your course of action. The rfa argument gives you the RFA of the record that was locked; you can store it to try again later or present a list of exceptions to the user, depending on what makes sense in your situation. If you want to skip the locked record, set lock to Q_NO_LOCK or Q_NO_DELETE, depending on whether this is a read or a delete operation. You register your event handler with the Select object by calling selectObj.RegisterEvent(new LockHandler()) prior to starting your Delete() call or FOREACH loop.
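For example, reusing selectObj and the LockHandler class from above, registration is a single call before the loop (a sketch; the loop body is elided):
selectObj.RegisterEvent(new LockHandler())
foreach emp_rec in selectObj
begin
    ; locked records are now retried or skipped according to LockHandler
    nop
end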
N + 1 problem
The N + 1 query problem is a common performance issue in database operations that is particularly evident in scenarios where a loop over a set of records triggers additional queries for each record. This problem is especially pertinent in an environment using xfServer.
The N + 1 query problem occurs when an application makes one query to retrieve N records from a primary table (the “driving table”) and then iterates over these records, making an additional query for each one to retrieve related data from another table. This results in a total of N + 1 queries (1 for the initial fetch and N for each record).
Consider two related tables: Employees (the driving table) and Departments. A typical operation might involve fetching all employees and then, for each employee, fetching data from the Departments table to get department details. An initial READS operation retrieves each employee record from the Employees table, and inside the loop over these employee records, a READ operation fetches the corresponding department information for each employee from the Departments table.
The approach described above results in one query for the driving table plus one additional query for each record in the driving table. This means if there are 100 employees, the system performs 1 (initial READS) + 100 (individual READs) = 101 database queries. Such a high number of queries can significantly degrade performance, particularly in scenarios with large datasets or over networked database systems like xfServer. Let’s take a look at the solution to this problem.
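Sketched in code, using the employee and department records and channels from the join example later in this section (the labels are illustrative), the pattern looks like this:
repeat
begin
    reads(channelEmp, employee) [EOF=done]              ; next record from the driving table
    read(channelDept, department, employee.Department)  ; one extra keyed READ per employee
    ; ... combine the employee and department data here ...
end
done,
    nop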
Addressing the problem with Select Join
- Efficiency of Joins: The Select class with Join functionality in DBL offers a more efficient way to handle related data from multiple tables. Instead of multiple queries, a single Select statement with Join can retrieve all the necessary data in one go.
- Reduced queries: This method significantly reduces the number of queries sent to the database, alleviating the performance issues associated with the N + 1 query problem.
- Optimized data retrieval: Select with Join optimizes data retrieval, making it particularly beneficial for applications that require data from multiple related tables.
To run this next example, you'll need to grab the department.txt and department.xdl files from the data directory in the companion repository. These are analogous to the employee.txt and employee.xdl files we used earlier, but for the departments table. We're going to use them to create a second ISAM file. Just like before, make sure you're in a dbl command prompt, and then run the following command:
fconvert -i1 department.txt -oi department -d department.xdl
Now let's write a program to read both of these files at the same time. Create a new file named join.dbl in your project directory and add the following code:
import Synergex.SynergyDE.Select
main
record employee
EmployeeID, a5
Name, a30
Department, a20
Salary, d9.2
endrecord
record department
DepartmentID, a20
Description, a2000
endrecord
record
channelEmp, i4 ; Channel for employee file
channelDept, i4 ; Channel for department file
fromEmp, @From ; From object for the employee file
fromDept, @From ; From object for the department file
joinSelectObj, @JoinSelect
rowObj, @Rows
endrecord
proc
; Open employee and department files
open(channelEmp=0, I:I, "employee.ism")
open(channelDept=0, I:I, "department.ism")
; Create From objects for both files
fromEmp = new From(channelEmp, employee)
fromDept = new From(channelDept, department)
; Create a Select object with an Inner Join
joinSelectObj = new Select(fromEmp.InnerJoin(fromDept,
& (On)(employee.Department == department.DepartmentID))).Join()
; Iterate over the joined records
foreach rowObj in joinSelectObj
begin
rowObj.fill(employee) ; Fill employee record
rowObj.fill(department) ; Fill department record
; Process the joined records (e.g., display details)
Console.WriteLine( "ID: " + employee.EmployeeID +
& ", Name: " + employee.Name +
& ", Department: " + department.DepartmentID +
& ", Department Description: " + %atrim(department.Description))
end
; Close the file channels
close channelEmp
close channelDept
endmain
The condition (employee.Department == department.DepartmentID) dictates how the rows from the two tables are matched. It essentially says, "Link each employee to their respective department by matching the department ID in the employee record with the department ID in the department record." Only those employee records that have a corresponding department entry will be included in the result set. When using Select Join, the record being matched must use a key; in this case, the department.xdl file specifies DepartmentID as a key. If you were writing this query in SQL, it would look something like this:
SELECT *
FROM Employees
INNER JOIN Departments
ON Employees.Department = Departments.DepartmentID;
The Select class offers two types of joins: InnerJoin and LeftJoin. Let's take a look at the differences between these two types of joins.
InnerJoin
- Definition: An InnerJoin combines rows from different files where there are matching values in both files.
- Behavior: If a row in the first file does not have a corresponding match in the second file, it is excluded from the result set.
- Use case: Use InnerJoin when you only want to retrieve records that have matching data in both files.
- SQL example: SELECT * FROM table1 INNER JOIN table2 ON table1.id = table2.id;
- Result: Only returns rows where there is a match in both table1 and table2.
LeftJoin (Left outer join)
- Definition: A LeftJoin returns all rows from the left (first) file and the matched rows from the right (second) file. If there is no match, rowObj.Fill(table2) returns false.
- Behavior: Includes all rows from the left file, regardless of whether they have matches in the right file.
- Use case: Use LeftJoin when you want all records from the left file and only the matched records from the right file. For unmatched entries in the left file, the rowObj.Fill(table2) call will return false, as shown in the sketch below.
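Here's a hedged sketch of what a LeftJoin variant of the earlier join example could look like. It reuses the records, channels, and From objects from join.dbl; the LeftJoin call mirrors the InnerJoin call shown above, and the return value of fill tells us whether a department row was found:
joinSelectObj = new Select(fromEmp.LeftJoin(fromDept,
&   (On)(employee.Department == department.DepartmentID))).Join()
foreach rowObj in joinSelectObj
begin
    rowObj.fill(employee)
    if (rowObj.fill(department)) then
        Console.WriteLine(%atrim(employee.Name) + " -> " + %atrim(department.Description))
    else
        Console.WriteLine(%atrim(employee.Name) + " has no matching department")
end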
Key differences
- Inclusion of unmatched rows: InnerJoin only includes rows with matches in both files. LeftJoin includes all rows from the left file, regardless of whether they have matches in the right file.
- Result set size: InnerJoin can result in a smaller result set if many rows don't have matches across the files. LeftJoin ensures that all rows from the left file are present in the result.
- Use in data analysis: LeftJoin is particularly useful in data analysis and reporting where you need to maintain all records from one dataset while including related records from another dataset when available.
By understanding the differences between InnerJoin and LeftJoin, you can more effectively retrieve the necessary data based on the relationships and matching criteria between different ISAM files.
DBMS
You’ve already seen hints of how to create ISAM files in prior sections of this book. In this section, we’ll dive deeper and also look at some of the tools that are available to help you manage your ISAM files.
Creation and definition of ISAM files
The bldism utility is central to this process, offering a way to create ISAM files from the command line. This can be done through keyboard input, parameter files, or an ISAM definition language (XDL) keyword file.
The ISAM definition language is another mini language for you to learn. Fortunately, it's pretty simple. XDL is a keyword-driven language where each keyword represents a specific attribute of the ISAM file. For instance, the FILE keyword indicates the beginning of the XDL description, and NAME filename specifies the name of the ISAM file you are creating. The ADDRESSING keyword defines the address length of the file. This choice used to be important, but now you should really just choose 48, which is commonly referred to as a TByte file. PAGE_SIZE sets the size of the index blocks. You should not choose a number smaller than your native disk block size, which today is almost always exposed as 4096. Understanding these keywords and their impact on the ISAM file structure is vital for effective database management.
Let's consider an example of creating a basic ISAM file. To begin, you would start with the FILE keyword to indicate the initiation of an ISAM file definition. Following this, you might specify NAME my_isam_file, giving your file a unique identifier. The ADDRESSING attribute could be set to 48 for a large file size capability, with PAGE_SIZE set to 4096 for optimal indexing performance. You would add additional keywords as required to define the structure and attributes of the file, such as KEYS to determine the number of keys or RECORD to define the data records.
An XDL file for a simple ISAM file might look something like this:
FILE
NAME my_isam_file
ADDRESSING 48
PAGE_SIZE 4096
KEYS 1
RECORD
SIZE 100
FORMAT fixed
KEY 0
START 1
LENGTH 10
TYPE alpha
In this example, a file named my_isam_file is created with 48-bit addressing and a page size of 4096 bytes. It's defined to have one key, and the record section indicates a fixed format with a size of 100 bytes. The key definition starts at position 1, has a length of 10 characters, and is of alpha type. Choosing the data type of the key changes how the data is ordered in the file. For instance, if the key is numeric, the data is ordered numerically. If the key is alpha, the data is ordered alphabetically.
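Assuming you save the definition above as my_isam_file.xdl, you can hand it to bldism to create the (empty) ISAM file. The exact invocation can vary with your environment, so treat this as a sketch:
bldism my_isam_file.xdl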
Here’s a cheat sheet for XDL:
Basic structure
- FILE: Begins the ISAM file definition.
- RECORD: Starts the record definition.
- KEY n: Defines the nth key (where n is the key number).
File attributes
- NAME [filename]: Sets the name of the ISAM file.
- ADDRESSING [32|40|48|NO40|NO48]: Sets the file address length.
- PAGE_SIZE [size]: Sets the size of the index blocks (e.g., 512, 4096, 8192).
- KEYS [num_keys]: Specifies the number of keys (1 to 255).
Optional file attributes
- ERASE_ON_DELETE [yes|no]: Controls record erasure on deletion.
- NETWORK_ENCRYPT [yes|no]: Enables/disables network encryption.
- ROLLBACK [yes|no]: Controls rollback functionality.
- SIZE_LIMIT [max_size]: Sets the maximum file size in MB.
- RECORD_LIMIT [recs]: Sets the maximum number of records.
- TRACK_CHANGES [yes|no]: Enables/disables change tracking.
- TEXT [“text_string”]: Adds text to the file header.
- TEXT_ALLOCATION [text_size[K]]: Allocates space for user-defined text.
- DENSITY [file_density]: Sets the packing density of index blocks.
- RESILIENT [yes|no]: Enables constant synchronization between index and data.
- FULLRESILIENT [yes|no]: Enhances resilience with direct disk writes.
- STATIC_RFA [yes|no]: Indicates if records have constant RFAs.
- STORED_GRFA [yes|no]: Specifies storing of CRC-32 in record header.
- PORT_INT [pos:len]: Defines position and length of nonkey integer data.
Record attributes
- SIZE [rec_size]: Sets the size of the data records.
- FORMAT [fixed|multiple|variable]: Specifies record format.
- COMPRESS_DATA [yes|no]: Enables/disables data compression.
Key definition
- START [pos_1[:pos_n]]: Sets start positions for key segments.
- LENGTH [len_1[:len_n]]: Defines lengths of key segments.
- TYPE [type_1[:type_n]]: Specifies types for key segments (e.g., alpha, integer, decimal).
- ORDER [order_1[:order_n]]: Sets sort order for key segments.
- NAME [key_name]: Assigns a name to the key.
- DUPLICATES [yes|no]: Specifies if duplicate keys are allowed.
- DUPLICATE_ORDER [fifo|lifo]: Sets order for duplicate key records.
- MODIFIABLE [yes|no]: Indicates if the key is modifiable.
- NULL [replicate|noreplicate|short]: Defines null key behavior.
- VALUE_NULL [null_val]: Sets the null value for a null key.
- DENSITY [key_density]: Sets the packing density for the current key.
- SEED [start_value]: Sets the starting value for a sequence autokey.
Usage notes
- Comments start with an exclamation point.
- Keywords and values can be abbreviated but must be distinguishable.
- Use semicolons or carriage returns to separate keywords and values.
- Keywords must follow their section: file keywords after FILE, record keywords after RECORD, key keywords after KEY.
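Putting several of the key attributes above together, here's a hypothetical second key added to the earlier example (you would also bump KEYS to 2 in the FILE section): a two-segment alpha key that allows duplicates. The name, positions, and lengths are made up for illustration:
KEY 1
NAME dept_name
START 11:31
LENGTH 20:10
TYPE alpha:alpha
DUPLICATES yes
DUPLICATE_ORDER fifo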
Disaster recovery: Handling data and index corruption in ISAM files
Sometimes, despite your best efforts, things go wrong. In the case of ISAM files, this can mean data corruption. Corruption can occur in the data file, the index file, or both. The isutl utility is a powerful tool for identifying and recovering from ISAM file corruption. It's a command-line utility that can be used to perform a variety of operations on ISAM files, including verification, recovery, and maintenance.
Common causes of ISAM file corruption
Forced termination: Abrupt termination of programs (e.g., using kill -9 in Unix/Linux or Task Manager in Windows) while they are writing data.
Hardware failures: Issues such as disk crashes, power outages, or network disruptions during data operations.
Identifying corruption with isutl
Using isutl -v
- This command verifies the integrity of Revision 4 or higher ISAM files using optimized methods.
- If isutl detects potential data access issues, it raises an error and halts the operation.
- For a thorough check, the -z option performs a linear scan, but it's more time-consuming and should be used judiciously, typically under the guidance of Synergex Developer Support.
Recovery strategies using isutl
Basic re-indexing (isutl -r): This command re-indexes the ISAM file without altering the data file, including record file addresses (RFAs). It's the first step in recovery, especially when no data corruption is evident.
Extended recovery (isutl -r with -d): When corruption in the data file is confirmed or suspected, combining -r with -d ensures a more rigorous recovery process. It involves low-level checking and validation of the data file during recovery. If you have SGRFA enabled, this option also checks the CRC-32 checksums in the record headers.
Preventative measures and best practices
- Regular backups: Frequent and systematic backups are essential for disaster recovery.
- Avoid abrupt termination: Ensure safe shutdown of applications and databases to prevent data corruption.
- Hardware and network reliability: Invest in reliable hardware and stable network infrastructure.
- Monitor system health: Regularly check for signs of disk or network issues.
- Access control: Limit direct file access to trained personnel and use application-level interfaces for database interactions.
- Regular maintenance: Use tools like isutl for routine checks to identify and fix minor issues before they escalate.
- Disaster recovery plan: Have a clear, tested plan in place for different scenarios of data loss or corruption. If you don't execute the plan from time to time, you don't really have a plan.
By understanding the causes of ISAM file corruption and using isutl effectively for both identification and recovery, you can significantly mitigate the risks associated with data loss. Coupled with preventative strategies and best practices, these measures provide a robust framework for managing ISAM file integrity and ensuring data reliability in DBL environments.
Stronger data integrity
The RESILIENT and FULLRESILIENT options in DBL offer enhanced safety for file operations, particularly in environments where data integrity is paramount. However, these options come with trade-offs in terms of performance and system resource utilization.
RESILIENT option
- Implementation: The RESILIENT option ensures that during file updates, such as writing or deleting records, there is a constant synchronization between the data file and the index file. This feature enhances the integrity of ISAM files by ensuring that both index and data are consistently in sync.
- Trade-offs:
  - Performance impact: There is a moderate performance overhead due to the additional synchronization steps. This might affect high-throughput scenarios where speed is critical.
  - System resources: RESILIENT increases use of system resources to maintain the synchronization, though typically not to a degree that would significantly impact overall system performance.
- Safety offered: RESILIENT provides a robust safeguard against data inconsistencies, especially in scenarios where a system crash or abrupt termination could leave the data in an inconsistent state.
FULLRESILIENT option
- Implementation: FULLRESILIENT builds upon the RESILIENT feature. In addition to maintaining synchronization between data and index files, it ensures that all write operations are immediately flushed to the disk. This is achieved using specific file open flags: FILE_FLAG_WRITE_THROUGH on Windows and O_DSYNC on UNIX.
- Trade-offs:
  - Performance impact: FULLRESILIENT can significantly slow down disk access. Every write operation incurs the overhead of ensuring data is written to the disk, which can be a bottleneck in write-intensive applications.
  - System resources: The demand on system resources is higher compared to the RESILIENT option due to the continuous disk write operations.
- Safety offered: This option offers the highest level of data safety. In the event of a system failure, FULLRESILIENT minimizes the risk of data corruption or loss, as all changes are immediately committed to the disk.
Making the right choice
- Assessing needs: The choice between RESILIENT and FULLRESILIENT should be based on the specific requirements of the application and the environment in which it operates. For applications where data integrity is more critical than performance, FULLRESILIENT is suitable. In contrast, if the application demands higher throughput and can tolerate a slight risk of data inconsistency in the event of a failure, RESILIENT might be the better choice.
- Understanding the environment: The impact of these options also depends on the underlying hardware and system architecture. Systems with faster disk write capabilities and high-performing I/O subsystems can mitigate some of the performance penalties associated with FULLRESILIENT.
- Balancing trade-offs: It's essential to balance the trade-offs between data safety and application performance. In many cases, a thorough testing and analysis phase might be required to understand the impact of these options on the overall system performance and data integrity.
In summary, RESILIENT and FULLRESILIENT offer different levels of safety for data operations in DBL. While RESILIENT ensures synchronization between data and index files, FULLRESILIENT goes a step further by writing changes immediately to the disk. The choice between them should be guided by the specific data integrity requirements and performance expectations of the application.
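Both options are just file attributes in XDL, so enabling one is a matter of adding a single keyword to the definition you build with bldism. For example, extending the earlier sketch (the rest of the definition is unchanged):
FILE
NAME my_isam_file
ADDRESSING 48
PAGE_SIZE 4096
RESILIENT yes
KEYS 1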
Validating the data in your ISAM files against your repository
Ensuring the integrity of the data stored in your ISAM files is an important part of maintaining a healthy DBL application. We’re going to dig into some of the specifics of what that means and how to do it.
Ensuring data integrity: The repository holds metadata that defines the structure of database files, including field types, lengths, and key definitions. Validating ISAM files against this metadata ensures that the data in these files adheres to the expected structure. At some point in the past, an application error may have allowed invalid data to be written to the ISAM file. Regular validation helps identify and correct these issues.
Consistency across applications: In environments where multiple applications or services interact with the same ISAM files, it’s important that they all have a consistent understanding of the file structures. Validating against a repository ensures that any application relying on these files will interpret and manipulate the data correctly, preventing issues caused by mismatched data formats or structures. Using the data from your ISAM files directly in xfODBC or a Harmony Core web service can bring this invalid data to the surface as crashes or unexpected results.
Facilitating database updates and migration: When upgrading or migrating databases, it’s essential to ensure that the existing data is compatible with new structures or formats. Validation against a repository can identify areas that require transformation or adjustment, facilitating a smoother migration or upgrade process.
Simplifying troubleshooting: In case of database-related issues, having validated and consistent data simplifies the troubleshooting process. It’s easier to isolate problems when you can rule out data structure inconsistencies as the cause.
Using fcompare to validate your ISAM files
The fcompare utility can compare ISAM file definitions with repository file definitions, ensuring that the structure and content of your ISAM files are in sync with the metadata defined in your repository. Here's the basic workflow for using fcompare:
Setting up the repository files: First, determine the repository main and text files you want to use for the comparison. These files contain the metadata against which the ISAM file will be checked. You can specify these files using the -r option followed by the paths to the rpsmain and rpstext files.
Specifying the ISAM file to check: Use the -f option to specify the name of the ISAM file you want to compare. This name should correspond to a specific repository file definition. fcompare will check all structures assigned to this file.
Enabling data verification: If you want to perform a thorough validation that includes checking the data within the ISAM file, use the -dv option. This mode ensures that date and decimal fields in your ISAM file contain valid values, based on their field types.
Logging the output: To capture the results of the comparison, you can specify a log file using the -l option. This is useful for recording discrepancies and later analysis.
Choosing the level of detail in messages: You can control the level of detail in the messages fcompare generates. Use -i for error, warning, and informational messages, or -v for just error and warning messages. It's best to use the verbose option -v for detailed insights, especially if discrepancies are expected.
Executing the command: Once you've set the appropriate options, run the fcompare command with the chosen parameters. For example:
fcompare -r path_to_rpsmain:path_to_rpstext -f file_def_name -dv -l log_file -v
Replace path_to_rpsmain, path_to_rpstext, file_def_name, and log_file with your actual file paths and names.
Interpreting the results: After running the command, review the output in the console or the specified log file. fcompare will list discrepancies between the ISAM file and the repository definitions. These could be differences in field types, lengths, key definitions, etc. Errors indicate necessary fixes to align the ISAM file with the repository, while warnings suggest potential performance issues.
Resolving discrepancies: If fcompare identifies any discrepancies, you'll need to update either the ISAM file or the repository definitions to resolve these differences. The specific changes will depend on the nature of the discrepancies found.
TODO: Specific example with bad isam data, an XDL and a repository file.
Back to the Bank: Persisting Data with ISAM
Searching
Advanced Memory Management
Advanced memory management in DBL provides powerful techniques for optimizing memory usage, improving application performance, and enabling dynamic data handling in complex systems. In this chapter, we’ll discuss core components of this functionality, such as memory handles and mapping as well as overlays and ranging. These techniques are designed to give developers fine-grained control over how memory is allocated, accessed, and organized, and they’re especially important when working with large datasets, shared resources, or memory-constrained environments.
- Memory handles provide a dynamic approach to memory allocation and management, allowing you to create and manage memory segments as needed. By using memory handles, applications can allocate and release memory during runtime, improving flexibility and enabling more efficient use of system resources.
- Mapping enables the flexible association of logical memory regions with physical memory addresses, which promotes efficient memory sharing. With mapping, you can create a seamless interface between persistent and volatile data, improving both performance and modularity.
- Overlays facilitate memory reuse by allowing multiple data structures or variables to occupy the same memory space at different times. Overlays are particularly valuable for optimizing fixed memory resources in embedded systems or legacy applications.
- Ranging allows for dynamic partitioning of memory by defining specific ranges for data storage. This technique ensures that applications can efficiently use available memory by segmenting it into manageable regions, reducing fragmentation and improving data locality. You can manipulate these ranges to isolate critical data or dynamically adjust memory boundaries at runtime.
Together, these tools form the foundation of effective memory management in DBL, enabling developers to craft solutions that are not only functional but also optimized for peak performance.
Ranging and Overlays
When defining fields within a record, group, or structure, the default behavior is to sequentially allocate memory for each new field, starting from where the previous field ends. But in some cases, you might need to interpret existing data from a fresh angle, map it to a different type, or treat a set of fields as a single entity. You can achieve this by overlaying memory, that is, specifying a field to start at a particular location within the data structure. To do this, you employ a position indicator (@) in your field definition.
It’s important to be aware that memory overlays can only be applied to fixed-size data types. These include alpha (a), decimal (d), implied decimal (id), and integer (i) types. In addition to these basic types, complex data structures can also be overlaid, but only if they are composed exclusively of these fixed-size data types.
Memory overlays are particularly valuable when dealing with records that represent different subtypes. Often, the first few bytes of a record indicate its specific subtype. By using memory overlays, you can manage this scenario effectively, delivering customized views of your data according to its subtype.
The area referred to by the position indicator starts at the specified offset within the record or group. If the field extends beyond the record being overlaid, you will encounter a compilation error: “Cannot extend record with overlay field.” However, if the field extends beyond a non-overlaid record and doesn’t overlay another field, DBL extends the record to fit the newly defined field.
To use a position indicator, the literal record must be named, or the field must exist in a GROUP/ENDGROUP block. There are two forms of indicating a field’s starting position: absolute and relative.
The absolute form involves specifying a single decimal value to represent the actual character position within the record or group where the field begins. For example, @2 indicates that the field begins at the second character position.
record title
id,a* @1, "Inventory ID"
onhand,a* @20, "Qty On-hand"
commtd,a* @35, "Committed"
order,a* @50, "On Order"
record line
id ,a12 @1
onhand ,a10 @20
commtd ,a9 @35
order ,a8 @50
proc
line.id = "Sausage Roll"
Console.WriteLine(title.id)
Console.WriteLine(line.id)
line.id = "Sausage Roll but too long of a title"
Console.WriteLine(line)
Output
Inventory ID
Sausage Roll
Sausage Roll
You can see from the output that the declared size and type of a field are still respected when storing data, even though the line.id field is declared as an absolute overlay.
The relative form specifies the starting position relative to another record or field using the syntax name[+/-offset]. Here, name represents a previously defined record or field, and offset indicates the number of characters from the beginning of name at which the field begins. The offset can be preceded by a plus (+) or a minus (-).
Let’s consider some examples:
record contact
name , a25
phone , a14, "(XXX) XXX-XXXX"
area , d3 @phone+1
prefix , d3 @phone+6
line_num , d4 @phone+10
In contact, phone references the entire 14-character field. Furthermore, area references the second through fourth characters of phone, prefix references the seventh through ninth characters, and line_num references the last four characters.
You may want to create an overlay, or a new view, on top of an entire existing data structure. This could be a group, record, or common. This can be achieved using the ,X syntax during declaration. When used, ,X indicates that you're creating an overlay for the most recent non-overlay entity of the same type. For instance, if you're declaring a new record with ,X, it will overlay the last defined non-overlay record. Similarly, it works for groups and commons. However, there's a special case for external commons: if you use ,X while declaring an external common, it will be ignored. Here's the general format of declaring an overlay:
record interesting_definition
aKey, a10
aBlob, a50
endrecord
record inventory_layout, x
id, a10
description, a20
quantity, d20
something_else,a10
endrecord
record customer_layout, x
id, a10
name, a20
address,a30
endrecord
proc
inventory_layout.description = "some kind of item"
inventory_layout.quantity = 5000000
Console.WriteLine(inventory_layout.quantity)
Console.WriteLine(customer_layout.address)
Output
5000000
00000000000005000000
You can see from the output that interpreting a decimal field as an alpha can lead to unexpected results, but it is very common for DBL projects to feature this kind of layout. Remember, when adding fields to a record with overlays, you must ensure that the absolute field positions haven't changed. Also, any field named in a position indicator must already have been defined, and the record referenced in the position indicator must be the record being overlaid. This memory overlaying feature in DBL allows for efficient memory usage and convenience when working with fixed layouts.
Absolute ranging
A ranged reference in DBL allows you to specify a range of characters within a variable’s data space. This range is specified in parentheses following the variable reference and represents the starting and ending character positions relative to the start of the variable’s data. The value of a ranged variable is the sequence of characters between, and including, these start and end positions. If the specified range exceeds the limit of the first variable, the range continues onto the next variable.
Ranging is possible on real arrays. An example would be my_alpha_array[1,2](1,2). However, subscripted arrays cannot be ranged, and trying to do so, such as my_alpha_array(4)(1,2), will produce an error.
If the variable being ranged is of implied-decimal type, the value is interpreted as a decimal. Please note that ranging is not permissible beyond the defined size of class data fields and records, or when the -qcheck compiler option is specified.
Absolute ranging specifies the start and end character positions directly. The format for this type of ranging is variable(start_pos,end_pos).
Relative ranging
Relative ranging allows you to specify a range in terms of a start position and length, or an end position and length backward from that end position. This is done using two numeric expressions separated by a colon in the format variable(position:length).
If the length is positive, the specified position is the starting point, and the length extends forward from this point. The value of the ranged variable is the sequence of characters that begins at the start position and spans the specified length.
Conversely, if the length is negative, the specified position is treated as an end point. The length then extends backward from this end position. In this case, the value of the ranged variable is the sequence of characters that starts the specified number of characters before the end position and ends at the end position.
This relative ranging mechanism provides you with flexible ways to interpret your data, whether you want to view it from a specified starting point forward or from a specified ending point backward.
Range restrictions
In Traditional DBL, referencing beyond the defined area up to the end of the data area is permitted if your dimension specification exceeds the declared size of its corresponding array dimension. This technique, known as "over subscripting," "subscripting off the end," or "over ranging," is not allowed in .NET. Moreover, it's also disallowed for class data fields in Traditional DBL or when the -qstrict or -qcheck options are specified.
While over subscripting might seem like a handy tool, it can introduce a significant number of bugs in production. Therefore, employing the -qcheck option for all development environments and -qstrict for all production builds is highly recommended to prevent these issues.
It's essential to note that these rules generally apply to fixed-size data types, such as alpha (a), decimal (d), implied decimal (id), and integer (i), as well as complex structures composed solely of these fixed-size data types. Interestingly, despite not being a fixed-size type, strings can also be ranged. The compiler handles this by interpreting the contents of the string as if it were an alpha type, which allows for flexible manipulation within the defined range.
Here’s an example showing absolute and relative ranging.
record demo
my_d_array ,[3,2]d2 , 12 ,34,
& 56 ,78,
& 98 ,76
my_a_array ,[2,4]a3, "JOE" ,"JIM" ,"TED" ,"SAM",
& "LOU" ,"NED" ,"BOB" ,"DAN"
proc
Console.WriteLine(demo.my_a_array[1,1](1, 2))
Console.WriteLine(demo.my_a_array[1,1](3:1))
Output
JO
E
Mapping
Memory handles and mapping structures allow for dynamic memory management but also some pretty unsafe hacks. They provide a way to create a structured view over a block of bytes, making it easier to manipulate complex data structures. Let’s break down these concepts and how you might see them show up in your codebase.
Memory handles
- Memory allocation: DBL uses memory handles to refer to dynamically allocated memory blocks. These blocks can vary in size and are identified by numeric handles.
- %MEM_PROC function: %MEM_PROC allows for allocating (DM_ALLOC), reallocating (DM_RESIZ), and freeing (DM_FREE) memory. Additionally, you can query the size of a memory block (DM_GETSIZE) or register external memory segments (DM_REG). DM_REG is probably a weird one to see in the wild, but it's needed if you want to work with memory allocated into a raw pointer from another language, such as C.
Mapping structures with ^M and ^MARRAY
- ^M operator: This operator “maps” a structure to a specified data area or memory handle. It aligns the first character of the data area with the start of the structure and creates a field descriptor that references this data. If the mapped field exceeds the bounds of the data area, an error is triggered.
- ^MARRAY function: ^MARRAY extends the concept of ^M to arrays. It maps a field or structure over a memory handle multiple times, returning an array of descriptors. The size of the allocated area determines the number of elements in the array. However, the array’s lifetime should not exceed that of the handle.
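Here's a minimal sketch of how a handle and ^M fit together. The structure and field names are made up, and depending on your environment you may need the standard system include that defines the DM_ constants:
main
record
    hndl, i4
structure inventory_item
    id, a10
    quantity, d10
endstructure
proc
    ; allocate room for 100 inventory_item entries
    hndl = %mem_proc(DM_ALLOC, ^size(inventory_item) * 100)
    ; write and read through the mapped view of the first entry
    ^m(inventory_item.id, hndl) = "WIDGET-01"
    ^m(inventory_item.quantity, hndl) = 42
    Console.WriteLine(%string(^m(inventory_item.quantity, hndl)))
    ; release the block when done
    xcall mem_proc(DM_FREE, hndl)
endmain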
Mapping over an alpha
That weird pattern where you have a local structure containing an a1 field and you over-index it at runtime to whatever size you want.
TODO What’s this stuff used for in the wild?
History lesson
TODO How can you use dynamic arrays, ArrayLists, and structures to do this better
Samples and diagrams
- Sample programs:
  - Basic memory allocation: Demonstrate allocating a memory block and accessing it with a numeric handle.
  - Structure mapping: Show how to map a structure over a memory block, modify the structure, and reflect these changes in the memory.
  - Array mapping with ^MARRAY: Illustrate creating an array of structures or fields using ^MARRAY and accessing individual elements.
- svgbob graphics:
  - Memory block visualization: An ASCII diagram showing a memory block and its handle, illustrating the concept of memory allocation.
  - Structure mapping diagram: A graphic representing the alignment of a structure to a memory block using ^M, showing the start of the structure aligned with the beginning of the memory block.
  - Array mapping illustration: A diagram depicting how ^MARRAY maps a structure over a memory block, showing the repeated instances of the structure within the allocated memory.
Quiz Answers
Common Patterns
TODO: Add content
Anti-Patterns
TODO: Add content
Inconsistent globals
Multiple definitions of the same routine
Big Ball Of Mud
The “big ball of mud” is a term used in software engineering to describe a system that lacks a discernible architecture. It is often considered an anti-pattern, representing a codebase that has grown haphazardly and chaotically over time, resulting in a monolithic structure where everything is entwined into a single, tightly-coupled entity. This often occurs when rapid growth and immediate functionality are prioritized over sustainable design and future scalability.
In the DBL environment, this pattern is frequently exhibited through extensive use of GLOBAL and COMMON. These large collections of global data variables can become sprawling, where understanding the interplay and dependencies among the variables can be a daunting task. As a result, the system becomes progressively more entangled, complex, and interconnected, resembling a big ball of mud.
This approach results in an overly coupled system where modules depend on shared state, and changes in one module can lead to unpredictable effects in other modules due to this shared global state. The extensive interconnectedness of components makes maintenance and updates increasingly difficult. The challenge is further amplified when the system needs to scale, as the tightly coupled nature of the system and its global state can hinder concurrency and parallel processing. Thus, the “big ball of mud” pattern, with its reliance on global data, often leads to a stifling of innovation, slow development speed, and an increase in the risk of bugs and system failures.
Refactoring a “ball of mud” into a better-organized codebase is a significant effort, but it can pay off in the long run with increased maintainability and scalability. Here are some steps you can take:
- Identify and isolate business logic: The first step in refactoring is to identify key business processes in your code. Once these are found, they should be isolated and separated from unrelated components. This helps reduce dependencies between different parts of your code.
- Adopt modular design principles: Aim for a design where each module has a single, well-defined responsibility. Such a modular design makes it easier to modify or replace individual modules without affecting the rest of the codebase.
- Enforce strict boundaries between components: Use interfaces, abstract classes, or similar constructs to enforce boundaries between different components. This ensures that each component communicates with others only through well-defined channels.
- Incremental refactoring: Instead of trying to refactor the entire codebase at once, prioritize the most problematic areas and tackle them first. Then gradually refactor other areas as and when required. This ensures the refactoring process doesn't disrupt regular development work too much.
- Automate testing: Automated testing is crucial when refactoring. This ensures that you haven't introduced any regressions in the process. It also enables you to make future changes with confidence.
- Continuous education: Ensure your development team is educated about good design principles and the value of a clean, well-structured codebase. This encourages everyone to maintain good practices as they continue to develop new features.
Moving from a monolithic “ball of mud” structure to a more modular and maintainable architecture is not an easy task. However, with a thoughtful approach and dedication to good design principles, it is certainly achievable and worthwhile in the long run.
Circular References
Circular references manifest when two or more entities—whether they be functions, modules, or data structures—depend on each other either directly or indirectly, thus creating a loop of dependencies. This phenomenon is particularly noticeable in procedural programming, where the flow of execution revolves around procedures or functions and modules that contain related functions. At a higher level, they can manifest between libraries or projects, where one library depends on another, which in turn depends on the first. This kind of circularity at the library or project level can pose significant challenges for building, maintaining, and deploying software.
Why circular references should be avoided
- Initialization issues: In procedural systems, the sequence of operations matters immensely. Circular dependencies can create ambiguity in the sequence in which functions or modules should be executed. If function A relies on function B, but function B in some way relies on A, then deciding which to run first becomes a challenge.
- Maintenance difficulty: Code with circular references tends to be more intricate and harder to decipher. If you change one function that's part of a circular dependency, it's easy to inadvertently affect others in the loop, potentially introducing unforeseen bugs.
- Build and deployment challenges: When two libraries reference each other, building or deploying them in isolation becomes problematic. Determining which library should be built, tested, or deployed first is ambiguous when each relies on the other.
- Tight coupling: This is a core concern with circular references. Tight coupling refers to when different parts of a program are interdependent, making them reliant on each other's detailed implementation. The dangers of tight coupling include:
  - Reduced flexibility: Changes in one function might necessitate changes in another due to their close interdependence.
  - Reduced readability: New developers or even the original programmers might find it difficult to understand the flow and logic of tightly coupled code, especially after some time has passed.
  - Reduced reusability: When functions are tightly coupled, extracting one for use in a different context becomes challenging since it's intertwined with other functions.
  - Increased fragility: A change or a bug in one function can have cascading effects throughout the system, making the program more prone to errors and harder to debug.
Techniques to refactor circular references in procedural code
- Reorder operations: Evaluate if the operations in the dependent functions can be reordered to break the cycle. Sometimes, resequencing tasks within functions can prevent the need for cyclic calls.
- Pass data instead of making calls: If one function is calling another just to get some data, consider if that data can be calculated beforehand and passed as an argument instead.
- Restructure libraries or modules: Whether dealing with intricate operations in procedural code or dependencies between libraries, extracting the common elements causing circularity can be a strategic move. By isolating these elements into separate helper functions, utility modules, or shared components, multiple functions or libraries can then reference this central entity. This approach not only breaks the circular dependencies but also emphasizes a reliance on shared abstractions or utilities rather than direct, intertwined dependencies.
- Dependency inversion: Instead of one library depending on another, both libraries can be made to depend on a third abstraction or a shared component. This inverts the dependencies, ensuring that higher-level libraries are not dependent on lower-level details but rather on abstractions (see the sketch after this list).
- Reevaluate design decisions: At both the procedural and library levels, circular references often hint at underlying design issues. Whether it’s the architecture, responsibilities, relationships of components, or the logic flow within a program, such circularities may indicate the need for a design review. Revisiting the overall structure, architecture, and design of the system can help identify and implement more linear or hierarchical structures, ensuring smoother operations and reducing dependencies.
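To make dependency inversion concrete, here is a minimal, hypothetical DBL sketch. The names (Shared.ILogger, OrderLibrary, LoggingLibrary) are invented for illustration; the point is that both libraries reference a small shared abstraction rather than each other:

namespace Shared
    public interface ILogger
        method WriteEntry, void
            message, string
        endmethod
    endinterface
endnamespace

namespace OrderLibrary
    ;; This library depends only on Shared.ILogger, not on any concrete logging library
    public class OrderProcessor
        private logger, @Shared.ILogger

        public method OrderProcessor
            aLogger, @Shared.ILogger
        proc
            logger = aLogger
        endmethod

        public method Post, void
            orderId, string
        proc
            logger.WriteEntry("Posting order " + orderId)
        endmethod
    endclass
endnamespace

namespace LoggingLibrary
    ;; This library also depends only on Shared, so neither library references the other
    public class ConsoleLogger implements Shared.ILogger
        public method WriteEntry, void
            message, string
        proc
            Console.WriteLine(message)
        endmethod
    endclass
endnamespace

In a layout like this, the shared project is built first and the other two each reference only it, which removes the circular build order problem.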
God Object
Super routines
At times, developers tackling a set of closely intertwined tasks might create a super routine, wherein the subfunction executed hinges on a provided parameter. Instead of having distinct routines for each task, these “multi-task” routines use the parameter to branch internally and either execute large blocks of code directly or CALL the appropriate local routine. While this approach can sometimes simplify the calling interface or reduce code repetition, it can also lead to routines that are harder to maintain and understand. Such routines can inadvertently violate the single-responsibility principle, as they take on multiple responsibilities based on parameter values. Over time, as more subfunctions are added or existing ones are modified, the complexity of these parameter-driven routines can increase, posing challenges in testing, debugging, and ensuring consistent behavior across all parameter values. Instead of a single routine handling multiple tasks based on a parameter, developers can create separate, well-named functions for each task. This promotes clarity and makes it easier to test and maintain each function independently. Here are some alternatives to consider if simply using separate functions won’t do the trick.
- Strategy pattern: This is an object-oriented design pattern that can be used to switch between different algorithms or strategies at runtime. Rather than using a parameter to select a branch of code, different strategies are encapsulated as separate objects that can be plugged into a context object as needed.
- Command pattern: Instead of determining the function to run based on parameters, you can encapsulate each function within a command object and invoke it using a consistent interface. This is particularly useful when you need to queue operations, support undo/redo operations, or decouple the requester from the operation itself.
- Lookup tables or dictionaries: Instead of branching logic based on parameters, map functions or methods to keys in a lookup table or dictionary. This way, the desired function can be retrieved and executed dynamically based on the provided key, ensuring a cleaner and more efficient structure (see the sketch after this list).
- Higher-order functions: In .NET, functions can be treated as first-class citizens, allowing them to be passed as arguments or returned as values from other functions. Instead of relying on parameters for branching, functions can be passed around to achieve the desired behavior.
- Factory pattern: For situations where you need to instantiate one of several possible classes based on input, the factory design pattern provides a mechanism to create objects without specifying the exact class to be created. This encapsulates the instantiation logic and can replace the need for super routines in object creation scenarios.
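As a rough sketch of the lookup-table approach in DBL (the task names and handler routines here are hypothetical, and the delegate wiring is just one way to do it), the parameter that used to drive a large branch instead selects a delegate from a dictionary:

import System.Collections.Generic

namespace Example
    class TaskDispatcher
        public static method Run, void
            taskName, string
        proc
            data postHandler, @Action, PostTransaction
            data voidHandler, @Action, VoidTransaction
            data handlers, @Dictionary<string, Action>, new Dictionary<string, Action>()
            handlers.Add("post", postHandler)
            handlers.Add("void", voidHandler)

            if (handlers.ContainsKey(taskName)) then
            begin
                ;; Retrieve the matching handler and invoke it
                data handler, @Action, handlers[taskName]
                handler()
            end
            else
                Console.WriteLine("Unknown task: " + taskName)
        endmethod

        private static method PostTransaction, void
        proc
            Console.WriteLine("Posting a transaction")
        endmethod

        private static method VoidTransaction, void
        proc
            Console.WriteLine("Voiding a transaction")
        endmethod
    endclass
endnamespace

Adding a new task then means adding one well-named handler and one dictionary entry, rather than growing a branch inside a super routine.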
Dirty Builds
TODO: Expand on dirty vs. clean builds and explain additive-style builds in more detail.
The ability to additively link or replace routines over time using DBOs, ELBs, and OLBs has historically facilitated quick debugging and streamlined deployment. More recently, with higher-speed internet connections and more powerful systems, these benefits have ceased to outweigh the considerable downsides.
Additive library builds
- State accumulation in ELBs: As ELBs and OLBs evolve with routines being linked in or replaced, they can accumulate states, including outdated configurations or remnants of previous routines. This leftover state can introduce unexpected behaviors or bugs that might not manifest in a cleanly built library.
- Version control difficulties: While traditional version control systems track changes in source files, the additive model of ELBs might complicate versioning. It can become challenging to discern specific changes between ELB versions or to merge alterations from different developers.
- Dependency challenges: Over time, as more routines are added or replaced in an ELB, it can be difficult to discern which parts are actively used and which have become obsolete or redundant. While not directly causing memory overhead, these lingering routines can introduce potential conflicts, bugs, or increased complexity.
- Onboarding new developers: For developers unfamiliar with the DBL environment and the specific history of an additively built library, there can be a learning curve. Comprehending the intricacies and evolution of a library, developed over an extended period, can be challenging.
- Deployment variability: While some users argue that the DBL system simplifies deployment by allowing for tiny patches, this approach can lead to environments with slightly different library states, especially if patches are applied inconsistently. This variability can make debugging and troubleshooting harder since not all environments are guaranteed to be identical.
Today you can use MSBuild-based projects to get highly controlled, well-structured builds that produce the same artifact for a clean build or an incremental one. Depending on your project structure, this process can actually be even faster than manual additive builds using custom scripts.
Beyond Types
In this chapter we delve into advanced programming concepts that transcend traditional type-based programming, unlocking a new expressiveness in your code. This chapter is dedicated to understanding and mastering generics, lambdas, extension methods, and functional programming styles—key tools that elevate the way you write and think about code. We will explore generics for creating highly reusable and type-safe components, lambdas for concise and flexible function definitions, and extension methods for augmenting existing types. Additionally, we’ll try to give you some bite-sized tools to embrace the elegance of functional programming, weaving these concepts seamlessly into your programming style. Crucially, the end of this chapter focuses on integrating these advanced techniques into legacy code, guiding you through the process of modernizing and enhancing existing codebases without disrupting their fundamental architecture. By the end of this chapter, you’ll be equipped with a deeper understanding and practical skills to elegantly tackle complex programming challenges.
Generics
.NET generics in DBL introduce a powerful way to create flexible, reusable, and type-safe code structures. At their core, generics allow developers to define classes, structures, methods, delegates, and interfaces that can operate on a variety of data types without specifying the exact data type during the time of their creation. This capability is not just a convenience but a significant enhancement in programming with DBL.
One of the key advantages of generics is the promotion of type safety. By using generics, developers can create data structures that are inherently safe at compile-time, reducing runtime errors related to type mismatches. This aspect is particularly beneficial when dealing with collections, where enforcing a consistent data type across all elements is crucial for reliability and maintainability of code.
Moreover, generics contribute to code optimization and reduction of redundancy. Prior to generics, developers often had to create multiple versions of classes or methods to handle different data types, leading to code duplication or expensive runtime checking and casting. With generics, a single class or method can cover a range of data types, making the codebase more concise and easier to manage. This not only streamlines development but also makes the code more readable and maintainable.
Syntax
Declaring generic types involves a straightforward syntax that closely resembles the conventional class declaration process, with the addition of type parameters. These type parameters, typically denoted as <T>, allow the class to handle various data types without specifying the exact type during the class definition. Here’s how you can declare generic types in Synergy .NET:
To declare a generic class, use the standard class declaration syntax but include a type parameter within angle brackets (<>). This type parameter acts as a placeholder for the actual data type that will be used when an instance of the generic class is created.
The basic syntax is as follows:
namespace Example
class ClassName<T>
; Class members go here, using T as a type
endclass
endnamespace
In this syntax, ClassName is the name of the class, and T is the generic type parameter. The letter T is conventionally used, but you can use any valid identifier.
Examples of generic class declarations
A simple generic class
namespace Example
class MyGenericClass<T>
public field, T
endclass
endnamespace
In this example, MyGenericClass is a generic class with a single type parameter T. The class contains a public field of type T.
Generic class with multiple type parameters
namespace Example
class Pair<T, U>
public first, T
public second, U
endclass
endnamespace
Here, Pair is a generic class with two type parameters, T and U. It can be used to store a pair of values of potentially different types.
Constraints
Constraints in generics are a pivotal feature in Synergy .NET, allowing developers to specify limitations on the types that can be used as arguments for type parameters in generic classes, methods, interfaces, or structures. These constraints provide a way to enforce type safety and ensure that generic types behave as expected. There are primarily three types of constraints in Synergy .NET generics: base class constraints, interface constraints, and the new keyword constraint.
Base class constraints
A base class constraint restricts a type parameter to a specified base class or any of its derived classes. By default, if no base class constraint is specified, the type parameter can be any class since it implicitly inherits from System.Object.
Syntax:
namespace Example
class MyClass<T(BaseClass)>
; Class members using T
endclass
endnamespace
In this example, T is constrained to BaseClass or any class derived from BaseClass.
Interface constraints
Interface constraints restrict the type parameter to classes that implement one or more specified interfaces. This ensures that the type argument provides the functionality defined in the interface, allowing the generic class to utilize these capabilities.
Syntax:
namespace Example
class MyClass<T(Interface1, Interface2)>
; Class members using T
endclass
endnamespace
Here, T must be a type that implements both Interface1 and Interface2.
The “new” keyword constraint
The new constraint specifies that a type argument must have a public parameterless constructor. This is particularly useful when you need to create instances of the type parameter within your generic class.
Syntax:
namespace Example
class MyClass<T(new)>
public method CreateInstance, T
proc
mreturn new T()
endmethod
endclass
endnamespace
In this class, T is constrained to types with a public parameterless constructor, allowing the CreateInstance method to create a new instance of T.
Combining constraints
Synergy .NET also allows you to combine these constraints to create more specific and controlled generic definitions.
Syntax:
namespace Example
class MyClass<T(BaseClass, Interface, new)>
; Class members using T
endclass
endnamespace
In this case, T must be a type that inherits from BaseClass, implements Interface, and has a public parameterless constructor.
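Here’s a small, hypothetical sketch of why you might combine constraints. The names ConnectionBase, IAuditable, and AuditedPool are invented for this example; the combination of constraints lets the generic class call new T(), the base class’s Open method, and the interface’s AuditLabel method without knowing the concrete type:

namespace Example
    public class ConnectionBase
        public method Open, void
        proc
            Console.WriteLine("Opening connection")
        endmethod
    endclass

    public interface IAuditable
        method AuditLabel, string
        endmethod
    endinterface

    public class AuditedPool<T(ConnectionBase, IAuditable, new)>
        public method CheckOut, T
        proc
            data item = new T()         ;; allowed by the new() constraint
            item.Open()                 ;; available through the base class constraint
            Console.WriteLine("Checked out: " + item.AuditLabel())  ;; through the interface constraint
            mreturn item
        endmethod
    endclass
endnamespace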
Practical usage of constraints
Using base class or interface constraints in generics is a strategic choice for developers aiming to enhance the robustness, safety, and clarity of their code. These constraints serve several important purposes:
Enforcing type safety
By specifying a base class or interface constraint, developers ensure that the generic type parameter adheres to a certain “shape” or set of behaviors. This guarantees that the generic class or method can safely invoke the methods and properties defined in the base class or interface, without the risk of runtime errors due to type mismatches. The alternative would be to use the System.Object type, which allows any type to be used as an argument for the type parameter. However, this would require the developer to perform runtime checks and casts to ensure that the type parameter is compatible with the methods and properties used within the generic class or method.
Leveraging polymorphism
Base class and interface constraints enable polymorphism in generics. A generic class constrained to a specific base class can work with any subclass of that base class, thus benefiting from polymorphism. This allows the generic class to handle a variety of related types in a uniform way.
Utilizing interface-defined capabilities
When a generic type is constrained to an interface, it can utilize the capabilities that are guaranteed by that interface. This is particularly useful when the generic class needs to perform operations that are specific to that interface, such as iterating over a collection (if the interface defines enumeration behavior), without needing to know the exact type of the objects it’s dealing with.
Designing flexible and reusable code
Constraints allow for the creation of flexible and reusable generic classes and methods. They enable developers to write code that is abstract and general enough to handle different types but also specific enough to enforce certain characteristics of these types. This balance between abstraction and specificity leads to more versatile and maintainable code.
Enhancing code readability and clarity
Using constraints clarifies the developer’s intent and the design of the generic type. It makes it clear to other developers what kinds of types are expected or allowed, improving the readability and maintainability of the code.
Preventing inappropriate usage
Constraints prevent the misuse of a generic type by restricting the type parameters to suitable types. This reduces the likelihood of runtime errors and ensures that the generic type is used as intended.
Facilitating code validation and debugging
With constraints in place, many potential errors can be caught at compile-time rather than at runtime. This early detection makes debugging and validation of the code easier, as the compiler can provide clear guidance on the proper use of the generic types.
Examples of generic class constraints
Generic class with a type parameter constraint
namespace Example
class MyClass<T(SomeBaseClass)>
public myField, T
endclass
endnamespace
This declaration shows a generic class MyClass with a type parameter T that is constrained to be a subclass of SomeBaseClass. This ensures that T inherits from SomeBaseClass, allowing you to use methods and properties of SomeBaseClass within MyClass.
Complex constraints
namespace Example
class ComplexClass<T(class1, iface1, iface2, new), S(class2), K(iface3), X(new)>
; Class members using T, S, K, X
endclass
endnamespace
In this more complex example, ComplexClass has multiple type parameters (T, S, K, X), each with different constraints. T is constrained to types that inherit from class1 and implement both iface1 and iface2 interfaces, and it must have a public parameterless constructor (new).
Constructing generic classes
Using generic classes involves constructing these classes with specific type arguments that conform to the defined constraints and rules of the generic class. This process allows developers to leverage the power and flexibility of generics while adhering to type safety.
To use a generic class, you construct or instantiate it by providing actual type arguments in place of the generic type parameters. This creates a constructed type, tailored to the specified type arguments.
Instantiating a simple generic class
Suppose you have a generic class MyGenericClass<T>:
namespace Example
class MyGenericClass<T>
public field, T
endclass
endnamespace
You can instantiate this class with a specific type, such as int:
record
myInstance, @MyGenericClass<int>
proc
myInstance = new MyGenericClass<int>()
Here, myInstance is an instance of MyGenericClass specifically constructed for the int type.
Multiple type parameters
For a class with multiple type parameters like Pair<T, U>:
namespace Example
class Pair<T, U>
public first, T
public second, U
endclass
endnamespace
You can instantiate it with two types:
record
myPair, @Pair<int, string>
proc
myPair = new Pair<int, string>()
In this case, myPair is an instance of Pair with int for T and string for U.
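Once constructed, the fields of myPair take on the concrete types you supplied. Here is a small, hypothetical usage sketch, assuming the Pair class above lives in the Example namespace:

import Example

main
record
    myPair, @Pair<int, string>
proc
    myPair = new Pair<int, string>()
    myPair.first = 1
    myPair.second = "one"
    Console.WriteLine(%string(myPair.first) + " maps to " + myPair.second)
endmain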
Restrictions on type arguments
When instantiating generic classes in Synergy .NET, there are certain restrictions on the types that can be used as type arguments. These restrictions ensure compatibility and prevent runtime errors.
- Synergy descriptor types: Certain Synergy descriptor types are not allowed as type arguments. These include a, d, d., i, i1-i8, n, p, or p., because these descriptor types represent specific data structures or behaviors that are not compatible with the generic system.
- No local structures: Local structures only exist within the scope of the method or subroutine they are declared in. This means they cannot be used outside of that scope, including as type arguments for generic classes. If you want to use a structure as a type argument, it must be declared at the global level.
- Drop the ‘@’ in type arguments: The ‘@’ symbol is not permitted in a type argument. For example, a declaration like @class1<@string> is not allowed and will result in a compilation error.
Generic methods
Generics are not limited to classes and structures; they also extend to methods, delegates, and interfaces. This further allows for flexible and reusable code designs. Here’s an overview of how to define and use these generic types.
Defining a generic method
To define a generic method, you include type parameters in the method declaration. These type parameters can then be used in the method’s return type, its parameters, or within the method body.
Syntax:
public static method MyGenericMethod<T>, T
p1, T
; Method body using T
endmethod
In this example, MyGenericMethod is a generic method with a type parameter T. It returns an object of type T and accepts a parameter of the same type.
Using a generic method
To use a generic method, specify the type argument when calling the method:
record
result, int
proc
result = MyGenericMethod<int>(5)
end
Here, MyGenericMethod is called with int as its type argument.
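To see a generic method end to end, here is a hedged sketch of a small helper; WrapInList is an invented name, not a library routine, but the declaration and call follow the same pattern shown above:

import System.Collections.Generic
import Example

namespace Example
    class GenericHelpers
        public static method WrapInList<T>, @List<T>
            item, T
        proc
            data result, @List<T>, new List<T>()
            result.Add(item)
            mreturn result
        endmethod
    endclass
endnamespace

main
record
    wrapped, @List<string>
proc
    wrapped = GenericHelpers.WrapInList<string>("hello")
    Console.WriteLine(%string(wrapped.Count))   ;; prints 1
endmain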
Generic delegates
Generic delegates are similar to generic methods but are used in scenarios where methods are passed as parameters or assigned to variables.
Defining a generic delegate
Define a generic delegate by specifying type parameters in its declaration:
Syntax:
delegate MyDelegate<T>, void
param1, T
enddelegate
MyDelegate is a delegate that can take a method accepting a parameter of type T.
Using a generic delegate
Assign a method that matches the delegate’s signature to an instance of the delegate:
namespace Example
class Example
public static method ExampleMethod, void
param, int
proc
; Implementation
endmethod
endclass
endnamespace
main
record
myDelegate, @MyDelegate<int>
proc
myDelegate = Example.ExampleMethod
end
In this example, ExampleMethod is assigned to myDelegate.
Generic interfaces
Generic interfaces allow you to define contracts with type parameters, making them versatile for various implementations.
Defining a generic interface
To define a generic interface, include type parameters in the interface declaration:
Syntax:
namespace Example
interface IMyInterface<T>
method DoSomething, void
param1, T
endmethod
endinterface
endnamespace
IMyInterface is a generic interface with a method that operates on type T.
Implementing a generic interface
When implementing a generic interface, specify the type argument in the implementing class:
namespace Example
class MyClass implements IMyInterface<int>
method DoSomething, void
param1, int
proc
; Implementation
endmethod
endclass
endnamespace
In this implementation, MyClass implements IMyInterface for the int type.
Uniqueness based on type parameters
The uniqueness of generic methods, delegates, or interfaces is determined by their signatures, which include the number of type parameters. This means that two generic items in the same scope are considered distinct if they have a different number of type parameters. However, if they have the same number and type of parameters, they are considered duplicates, and the compiler will throw an error.
For example, two methods named MyMethod with one type parameter each are considered duplicates, even if the names of the type parameters are different. However, MyMethod<T> and MyMethod<T, U> are considered unique due to the differing number of type parameters.
This rule ensures that each generic method, delegate, or interface is distinctly identifiable by its parameter structure.
Constructed types in generics
A constructed type is created by specifying actual types in place of the generic type parameters when you instantiate a generic class, method, or delegate. For example, if you have a generic class MyGenericClass<T>, you can create a constructed type by replacing T with a specific type like int or string.
Example:
namespace Example
class MyGenericClass<T>
public field, T
endclass
endnamespace
main
record
intInstance, @MyGenericClass<int>
stringInstance, @MyGenericClass<string>
proc
intInstance = new MyGenericClass<int>()
stringInstance = new MyGenericClass<string>()
end
In this example, MyGenericClass<int> and MyGenericClass<string> are two different constructed types of the same generic class.
Static fields in generic classes
Static fields are not shared globally across all instances of a generic class but are instead specific to each constructed type.
Sharing static fields among instances of the same constructed type
Static fields are shared among all instances of the same constructed type. This means that if you create multiple instances of a generic class with the same type arguments, they will share the same static field values.
Example:
namespace Example
class MyGenericClass<T>
public static field, int
endclass
endnamespace
main
record
instance1, @MyGenericClass<int>
instance2, @MyGenericClass<int>
proc
instance1 = new MyGenericClass<int>()
instance2 = new MyGenericClass<int>()
MyGenericClass<int>.field = 10
; Both instance1 and instance2 see MyGenericClass<int>.field as 10
end
In this scenario, instance1 and instance2 share the same static field MyGenericClass<int>.field.
Different constructed types have separate static fields
When you instantiate a generic class with different type arguments, each constructed type has its own set of static fields. This separation ensures that static fields are relevant and specific to the type they are associated with.
Example:
record
intInstance, @MyGenericClass<int>
stringInstance, @MyGenericClass<string>
proc
intInstance = new MyGenericClass<int>()
stringInstance = new MyGenericClass<string>()
MyGenericClass<int>.field = 10
MyGenericClass<string>.field = 20
; intInstance sees MyGenericClass<int>.field as 10
; stringInstance sees MyGenericClass<string>.field as 20
In this example, intInstance and stringInstance do not share the same static field; they each have their own version of field, one for MyGenericClass<int> and one for MyGenericClass<string>.
Lambdas
Lambdas, a concept that might initially seem daunting, are a powerful feature in modern programming languages like DBL. They allow you to write more concise and flexible code. To understand lambdas, it’s crucial to grasp two key concepts: “captured variables” and “code as data.”
What are lambdas?
Lambdas are essentially anonymous functions—functions without a name—that you can define inline (within another method) and use for short and specific tasks. They are particularly handy when you want to pass a block of code as an argument to a method or when you need a simple function for a short duration.
Captured variables
When you define a lambda inside a method, it can access variables from its enclosing scope. These are known as “captured variables.” This ability is powerful as it allows the lambda to interact with the context in which it was defined.
For example, consider a lambda defined inside a routine where it captures a local variable:
function MyFunction, void
record
localVar, int
proc
data localVar2, int, 10
localVar = 50
lambda AddToVar(x)
begin
mreturn localVar + x + localVar2
end
;; Here, the lambda 'AddToVar' captures 'localVar'
;; and 'localVar2' from the enclosing scope.
data invocable, @Func<int, int>, AddToVar
Console.WriteLine(%string(invocable(5))) ;; prints 65
end
The lambda AddToVar can access and use the variable localVar, which is defined outside of it in MyFunction.
Understanding captured variables in the context of loops
Captured variables in the context of loops in DBL (or any programming language that supports lambdas and closures) can be a source of confusion, especially when it comes to understanding how they behave inside the loop. Let’s break this down for clarity.
What is a captured variable?
A captured variable is a variable that is declared in an outer scope (like a method) and is used or “captured” inside a lambda expression or anonymous function. This lambda is then able to use or modify the variable even if it executes in a different scope than where the variable was originally declared.
Capturing variables in loops
The most common issue arises when lambdas are stored (e.g., added to a list) during each iteration and executed later. Developers might expect each lambda to remember the value of i at the time of its creation, but instead, they all reflect the final loop value.
To demonstrate this, consider the following example:
subroutine ExampleRoutine
record
i, int
funcs, @List<Func<string>>
proc
funcs = new List<Func<string>>()
for i from 1 thru 5
begin
data currentIteration = i ;; Local copy
lambda MyLambda()
begin
mreturn "Value: " + %string(currentIteration)
end
;; Store MyLambda
funcs.Add(MyLambda)
end
;; Invoke all lambdas
foreach data func in funcs
Console.WriteLine(func())
xreturn
end
When you run this code, you might expect the following output:
Value: 1
Value: 2
Value: 3
Value: 4
Value: 5
But instead you get:
Value: 5
Value: 5
Value: 5
Value: 5
Value: 5
To solve this problem, you can use a lambda factory method to create a new lambda for each iteration. This ensures that each lambda has its own copy of the captured variable.
namespace Example
class ExampleClass
public static method CreateLambdaWithValue, @Func<string>
val, int
proc
lambda MyLambda()
begin
mreturn "Value: " + %string(val)
end
mreturn MyLambda
endmethod
endclass
endnamespace
subroutine ExampleRoutine
record
i, int
funcs, @List<Func<string>>
proc
funcs = new List<Func<string>>()
for i from 1 thru 5
begin
;; Store MyLambda
funcs.Add(ExampleClass.CreateLambdaWithValue(i))
end
;; Invoke all lambdas
foreach data func in funcs
Console.WriteLine(func())
xreturn
end
Now when you run it, you get the expected output:
Value: 1
Value: 2
Value: 3
Value: 4
Value: 5
This can also be considered partial application, which is a technique that allows you to create a new function by pre-filling some of the arguments of an existing function. Currying is a functional programming technique used to transform a function with multiple arguments into a sequence of functions, each with a single argument. In essence, currying takes a function that accepts multiple parameters and breaks it down into a series of unary functions (functions with only one parameter). This is achieved by returning a new function for each argument, which captures the argument passed to it and returns a new function expecting the next argument. This process continues until all arguments are received, at which point the original function is executed with all of the captured arguments. Currying is particularly useful for creating a higher-order function that is both reusable and easily configurable. It enables partial function application, where a function that takes multiple arguments can be transformed into a chain of functions, each taking a part of the arguments, thereby creating new functions with fewer parameters. This approach enhances the flexibility and modularity of the code, allowing functions to be more dynamically composed and adapted to various contexts in a clean and intuitive manner.
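Here is a hedged DBL sketch of partial application, following the same factory pattern used earlier in this chapter; the names MakeAdder and addFive are illustrative. MakeAdder pre-fills one argument of an addition and returns a new single-argument function:

namespace Example
    class CurryingExample
        public static method MakeAdder, @Func<int, int>
            amount, int
        proc
            lambda AddAmount(x)
            begin
                mreturn x + amount      ;; "amount" is captured, pre-filling one argument
            end
            mreturn AddAmount
        endmethod
    endclass
endnamespace

main
record
    addFive, @Func<int, int>
proc
    addFive = CurryingExample.MakeAdder(5)
    Console.WriteLine(%string(addFive(10)))   ;; prints 15
endmain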
Code as data
Lambdas embody the concept of treating code as data. This means you can assign a block of code to a variable, pass it as a parameter, or even return it from a function, just like you would with data.
For example:
main
proc
lambda MyLambda(x, y)
begin
mreturn x * y
end
;; Assigning lambda to a variable
data myFunc, @Func<int, int, int>, MyLambda
;; Passing lambda as an argument
SomeMethod(MyLambda)
endmain
Practical use of lambdas
Lambdas are widely used for
- Event handling: Assigning a lambda to an event makes the code more readable and concise.
- Working with collections: Operations like sorting, filtering, and transforming collections are more straightforward with lambdas.
- Asynchronous programming: Lambdas make it easier to write asynchronous code, especially with the ASYNC keyword.
Why use lambdas?
- Conciseness: Lambdas reduce the boilerplate code required for defining a full-fledged method.
- Clarity: Lambdas often make the code easier to read, as the functionality is defined right where it’s used.
- Flexibility: Lambdas allow for dynamic programming patterns, adapting behavior at runtime.
Lambdas might seem complex at first, especially when dealing with captured variables and the notion of code as data. However, with practice, they will become an intuitive and powerful tool in your DBL programming toolkit. Remember, lambdas are all about writing less to do more, elegantly and efficiently.
A deeper look at currying and partial application
Currying and partial application are powerful concepts in functional programming, and their utility extends to multi-paradigm languages like DBL, which support functional programming constructs. Here’s a more compelling explanation of why a developer might want to use currying or partial application:
Code reusability and abstraction
- Modular function design: Currying allows the breakdown of a complex function with multiple arguments into a series of simpler, unary functions (functions with a single argument). This modular approach makes functions more reusable, as each unary function can be used independently across different parts of the application.
- Function specialization: Partial application is a technique to fix a few arguments of a function and generate a new function. This is particularly useful for creating specialized functions from a general-purpose function without rewriting it. For instance, if you have a generic add function, you can create a new function like addFive that always adds five to any given number, reusing the add function logic.
Enhanced flexibility and function configuration
- Dynamic function configuration: Currying and partial application allow functions to be dynamically configured with specific arguments ahead of time. This is particularly useful in scenarios where certain parameters of a function are known in advance and remain constant throughout the application.
- Delayed execution and function pipelines: By transforming functions into a chain of unary functions, currying naturally supports delayed execution. Functions can be set up early in the program and executed later when required, enabling the creation of sophisticated function pipelines and data flows.
Improved readability and maintenance
- Cleaner code: By using curried functions, the code can often be made more concise and readable. It allows for the creation of higher-order functions that encapsulate specific behaviors with descriptive naming, making the codebase easier to understand and maintain.
- Better abstraction: Currying and partial application promote a higher level of abstraction in function definitions. They help in abstracting away the repetitive parts of the code, leading to a DRY (don’t repeat yourself) codebase.
Practical use cases
- Event handling and callbacks: In event-driven programming, currying can be used to create event handlers or callbacks that need specific data to be executed but don’t get that data until the event occurs.
- Dependency injection: Partial application can be used for a form of dependency injection, where a function requiring several inputs gets some of its inputs (dependencies) pre-filled.
- Creating configurable APIs: APIs that require flexible configuration can benefit from currying and partial application, allowing users to customize and configure behaviors with partial sets of arguments.
Understanding the flow of execution with lambdas
In imperative programming, code execution is generally linear and procedural. You write code in the order you expect it to run. However, in functional programming with lambdas, execution can be more abstract and deferred. Lambdas are like packaged pieces of code that you define now but execute later. They allow you to pass around behavior (code) as data.
Execution flow with lambdas
- Defining a lambda: You define a lambda expression, but at this point, it’s just created and stored, not executed.
- Passing a lambda around: The lambda can be passed as an argument, stored in a variable, etc. During this phase, the lambda is still not executed.
- Triggering lambda execution: The lambda is eventually executed at a point where its behavior is needed. This could be in a different part of the code, often in response to an event or when processing a collection of data.
Consider a simple scenario where a lambda is passed to a function and then executed later; the example below walks through exactly that.
Practical example
Consider an example with a list of numbers and a lambda for filtering:
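The following is a hedged sketch (FilterNumbers and the other names are invented for this example). The lambda KeepLarge is defined in main, handed to FilterNumbers as data, and only executed later inside that method’s loop:

import System.Collections.Generic

namespace Example
    class Filtering
        ;; Applies the supplied predicate to each item and collects the matches
        public static method FilterNumbers, @List<int>
            items, @List<int>
            predicate, @Func<int, boolean>
        proc
            data result, @List<int>, new List<int>()
            foreach data item in items
            begin
                if (predicate(item))
                    result.Add(item)
            end
            mreturn result
        endmethod
    endclass
endnamespace

main
record
    numbers, @List<int>
    bigNumbers, @List<int>
proc
    numbers = new List<int>()
    numbers.Add(1)
    numbers.Add(5)
    numbers.Add(10)

    ;; The lambda is only defined here; it runs later, inside FilterNumbers
    lambda KeepLarge(x)
    begin
        mreturn x > 3
    end

    bigNumbers = Filtering.FilterNumbers(numbers, KeepLarge)
    foreach data n in bigNumbers
        Console.WriteLine(%string(n))   ;; prints 5 and 10
endmain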
Explanation for developers new to functional style
- Lambdas are blueprints: Think of a lambda as a blueprint or a plan. When you define a lambda, you’re drafting a plan on how to do something, but you’re not doing it yet.
- Execution is deferred: Lambdas don’t do anything by themselves when they’re defined. They spring into action only when called upon, which can be at a completely different time and place in your code.
- Passing behavior: One of the strengths of lambdas is the ability to pass behavior around your application, just like you pass data. It offers a high level of abstraction and modularity.
- Trigger point: The actual execution of a lambda usually happens inside a function or a method that accepts a lambda as an argument. This function decides when to run the lambda.
Understanding this flow helps in visualizing how lambdas fit into the bigger picture of your application’s execution, making them less mysterious and more of a powerful tool in your programming toolkit.
Async
Think about making breakfast. In a traditional single-threaded approach, you might heat up the pan, wait for it to get hot, crack an egg, wait for it to cook, then start toasting bread, wait for it to brown, and finally plate everything. Each step happens one after another, and you’re basically standing there waiting at each step.
Async programming is more like how you’d actually cook breakfast: you put the pan on to heat while you get out the eggs and bread. While the egg is cooking, you start the toast. You’re not doing multiple things simultaneously yourself (that’s what true multi-threading would be), but you’re able to start one task and work on something else while waiting for it to complete.
The async modifier on a method or lambda tells the compiler “this code will start some tasks and continue working while waiting for them to finish.” However, just marking something as async doesn’t make it asynchronous by itself - you need to use the await keyword to identify points where you’re waiting for something to finish.
Here’s a simplified example that uses an HttpClient to fetch the contents of a made-up URL and then waits for that request to finish before continuing:
public async method MakeTheCall, @Task
proc
    ; Assumes System.Net.Http is available; the URL and output are illustrative only
    data client, @HttpClient, new HttpClient()
    data response, string
    response = await client.GetStringAsync("https://example.com")
    Console.WriteLine(response)
endmethod
When this code hits the await keyword, it’s like putting something in the microwave and stepping away - the method returns control to whatever called it, allowing other code to run while waiting for the download to complete. Once the download finishes, the code after the await continues executing.
One important detail: an async method can only “pause” at an await statement. Before the first await (or if there are no awaits at all), the method runs synchronously just like any other code. That’s why you may not want to be doing time-consuming work before your first await statement - you’re still blocking the calling code during that time.
The return type of an async method is important too. In DBL, async methods must return either void, @Task, or @Task<>. You can think of a Task as a promise that work will be completed. If your method needs to return a value, you’ll use Task<YourReturnType>, and the caller will need to await your method to get the actual value.
Async methods that return void
Async void methods should generally be avoided except for one specific use case: event handlers. The reason comes down to error handling and how these methods interact with the rest of your async code.
When you have an async method that returns a Task, the caller can await that Task and catch any exceptions that occur during its execution. However, with async void methods, there’s no Task to await - meaning exceptions can’t be caught by the calling code. Instead, these exceptions end up going directly to your application’s global exception handler, which can make debugging and error handling much more difficult.
Think of it like the difference between having a safety net (async Task) versus not having one (async void). With async Task, if something goes wrong, the exception gets caught in the net and you can handle it gracefully. With async void, it’s like removing the safety net - any errors that occur will propagate up through your application in ways that are hard to predict and control.
The reason async void exists at all is historical: when async/await was added to .NET, existing event handler signatures like button1.Click += async lambda(sender, e) { ... } needed to work with async code. Since these event handlers return void by convention, async void was created to support this specific scenario. For any other async method you write, returning a Task is almost always the better choice as it gives you proper error handling and the ability to await the operation’s completion.
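To illustrate the difference, here is a hedged sketch (the method names and the simulated failure are invented). Because SaveDataAsync returns a Task, the caller can await it inside a try/catch and observe the failure; had it been declared async void, the exception would bypass this handler entirely:

public async method SaveDataAsync, @Task
proc
    await Task.Delay(100)               ;; stand-in for real I/O
    throw new Exception("save failed")
endmethod

public async method CallerAsync, @Task
proc
    try
    begin
        await SaveDataAsync()           ;; the exception surfaces here
    end
    catch (ex, @Exception)
    begin
        Console.WriteLine("Handled: " + ex.Message)
    end
    endtry
endmethod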
The same principles apply to async lambdas - they’re just compact methods that can capture variables from their surrounding scope. They need to be marked with the async modifier if they’re going to use await, and they have the same restrictions on return types and what variables they can access.
The key to understanding async/await is remembering that it’s not about doing multiple things at once - it’s about being able to start something that might take a while and do other useful work while you wait for it to complete, just like an efficient cook in the kitchen.
Blocking with Wait() or Result
When you call .Result or .Wait() on a Task, you’re essentially telling your code “stop everything and wait right here until this completes.” This might seem like a simple way to get synchronous code to work with async code, but it can lead to deadlocks in certain contexts, particularly in UI applications or ASP.NET. Here’s why: Many modern frameworks, including Windows Forms, WPF, and ASP.NET, use a synchronization context to ensure certain code runs on the right thread. When you await an async method, the framework captures this context and uses it to resume the method on the correct thread. However, when you block with .Result, you’re holding onto that thread while waiting for code that might need that same thread to complete.
Let’s make this concrete with an example:
; UI Button Click Handler
public method button1_Click, void
tkSource, @object
tkArgs, @EventArgs
proc
; This is running on the UI thread
; DANGEROUS! Could deadlock because we're blocking the UI thread
data result, string
result = GetDataAsync().Result
endmethod
public async method GetDataAsync, @Task<String>
proc
; This will try to resume on the UI thread
await Task.Delay(1000) ; Simulating work
mreturn "Hello"
endmethod
This can deadlock because:
- The UI thread calls GetDataAsync().Result and blocks
- GetDataAsync tries to resume on the UI thread after the delay
- But the UI thread is blocked waiting for the Result
- Neither piece can proceed - classic deadlock
There are several ways people try to work around this. We’ll start with using .ConfigureAwait(false) in the async method:
public async method GetDataAsync, @Task<String>
proc
; Tell the runtime we don't need to resume on the original thread
await Task.Delay(1000).ConfigureAwait(false)
mreturn "Hello"
endmethod
This tells the async method “don’t try to resume on the original thread.” It can prevent the deadlock, but it’s a band-aid that can cause other issues if you actually needed that context. Next, we’ll look at using GetAwaiter().GetResult():
public method button1_Click, void
tkSource, @object
tkArgs, @EventArgs
proc
; Still dangerous! Just changes how the exception is wrapped
data result, string
result = GetDataAsync().GetAwaiter().GetResult()
endmethod
This is functionally similar to .Result but throws the original exception rather than wrapping it in an AggregateException. It still has the same deadlock risks, though. For our third option, we’ll use Task.Run to escape the current context and then block using .Result.
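A hedged sketch of that third option, reusing the button1_Click and GetDataAsync pair from above; the explicit Func variable is just one way to select the Task-returning overload of Task.Run:

public method button1_Click, void
    tkSource, @object
    tkArgs, @EventArgs
proc
    ; Task.Run moves the work to a thread-pool thread, so GetDataAsync no longer
    ; needs the blocked UI thread to resume. This avoids the deadlock, but the UI
    ; thread is still wasted while it waits.
    data fetchFunc, @Func<Task<String>>, GetDataAsync
    data result, string
    result = Task.Run(fetchFunc).Result
endmethod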
If you absolutely must call async code from synchronous code (like in a console app’s Main method), you can use a dedicated async entry point or create a new context:
main
proc
; Create a dedicated context for async operations
data context, @AsyncContext
context = new AsyncContext()
; Run our async code in this context
context.Run(
& lambda async()
& begin
& data result, string
& result = await GetDataAsync()
& writes(1, result)
& end)
endmain
Remember
Trying to force async code to be synchronous is like trying to push water uphill - you can do it, but it’s fighting against the natural flow and will likely cause problems. Instead, design your code to work with the async nature of modern programming as early as possible. The performance implications are worth mentioning too - blocking a thread, especially on a web server, means that thread can’t be used to handle other requests. In high-throughput scenarios, this can significantly reduce your application’s ability to scale.
Extension Methods
Extension methods provide a way to add new methods to existing classes or data types without modifying their source code or inheriting from them. These methods are static methods, but they’re called as if they were instance methods on the extended type. The purpose of extension methods is to extend the capabilities of existing types in a clean and maintainable way.
Difference from traditional methods
- Static vs. instance: Traditional methods are either instance methods (belonging to an instance of a class) or static methods (belonging to the class itself). Extension methods, while declared as static, are used like instance methods. They are a special kind of static method that can be called on an instance of a class.
- Class modification: Traditional methods require modifying the class definition to add new methods. Extension methods, on the other hand, allow adding new methods to existing classes without altering their source code.
- Scope of accessibility: Traditional methods can access private and protected members of the class they belong to. Extension methods can only access the public and internal members of the types they extend, as they are not part of the type’s internal implementation.
Benefits of using extension methods
- Enhancing functionality: Extension methods enable developers to add functionality to classes for which the source code is not available, such as .NET Framework classes or third-party library classes.
- Improved readability: They can make your code more readable and expressive. For example, instead of having a utility class with static methods, you can use extension methods to make it look like the functionality is a part of the existing type.
- Maintainability: Since extension methods don’t modify the original class, they help in maintaining a clean separation of the original code and the extended functionalities. This separation is particularly useful when the original classes are subject to updates or are part of a library.
- No need for inheritance: Extension methods provide an alternative to inheritance for adding functionality to a class. This is especially useful when dealing with sealed classes or when inheritance would lead to an unnecessary or overly complex class hierarchy.
- LINQ and fluent interfaces: In .NET, extension methods are a key part of LINQ (Language Integrated Query), enabling a fluent and intuitive way of processing data.
- Reduced boilerplate: Extension methods can reduce boilerplate code by eliminating the need for numerous utility classes. Functionalities that would typically be put into a utility class can be made into extension methods, making them more discoverable and contextually relevant.
Introduction to basic syntax
Extension methods are a special kind of static method. They are defined in a static class and have a syntax that is similar to regular static methods with some key differences. The basic structure of an extension method includes the EXTENSION and STATIC modifiers, followed by the method definition. Here’s a general syntax template:
public static extension method MethodName, ReturnType
parm1, ExtendedType
OtherParameters...
proc
; Method implementation
endmethod
Significance of the EXTENSION and STATIC modifiers
- STATIC modifier:
  - Indicates that the method is static and belongs to the class rather than any individual instance.
  - As with any static method, it can be called without creating an instance of the class in which it is defined.
- EXTENSION modifier:
  - This modifier is specific to extension methods.
  - It signals that the method is intended to extend the functionality of the ExtendedType (the type of the first parameter).
  - This modifier enables the method to be called as if it were a method of the ExtendedType itself.
Structure of an extension method
- First parameter (ExtendedType):
  - The first parameter of an extension method specifies the type that the method extends.
  - This parameter is crucial because it denotes the type on which the extension method can be called.
  - When calling the extension method, this first parameter is not passed explicitly; instead, the method is called on an instance of the ExtendedType.
- Other parameters:
  - After the first parameter, additional parameters can be defined as in any standard method.
  - These parameters are used as usual, and their values must be provided when the extension method is called.
- Method body:
  - Within the method body, the first parameter (ExtendedType) is used as if it were an instance of that type, enabling the extension functionality.
Example
Consider an example where we extend the string type with a WordCount method:
namespace MyExtensions
public static class StringExtensions
public extension static method WordCount, int
Parm1, string
proc
; Word-counting logic omitted for brevity; numberOfWords would hold the computed count
data numberOfWords, int, 0
mreturn numberOfWords
endmethod
endclass
endnamespace
In this example:
- WordCount is an extension method that can be called on instances of string.
- It counts the number of words in the string and returns this count.
- The Parm1 parameter represents the string instance on which the WordCount method is called.
Role of namespaces in extension methods
- Grouping by functionality: It’s a good practice to group extension methods into namespaces based on their functionality or the types they extend. For instance, all string-related extension methods can be in one namespace, while numeric type extensions can be in another.
- Avoiding naming conflicts: Namespaces prevent naming conflicts, especially when you extend types that are not defined in your codebase (like standard library types). This is important because different libraries might introduce extension methods with the same name.
- Ease of maintenance: Organizing extension methods in namespaces makes your codebase easier to navigate and maintain. It’s clear where to find and how to use these extensions.
Using the namespace in your program
To use the WordCount method we defined earlier, include the MyExtensions namespace (where StringExtensions is declared) at the beginning of your DBL source file:
import MyExtensions
main
record
myString, string
wordCount, int
proc
myString = "Hello, world!"
wordCount = myString.WordCount()
Console.WriteLine("Word count: " + %string(wordCount))
end
Here, the import MyExtensions statement makes the WordCount extension method accessible. You can then call WordCount() on any string instance as demonstrated.
Limitations of extension methods
Extension methods offer a convenient way to add functionality to existing types, but they come with certain limitations and considerations that developers should be aware of.
- No access to private members: Extension methods cannot access private or protected members of the type they are extending. They can only work with public and internal members, as they are essentially external to the type’s implementation.
- Not part of the type’s definition: Extension methods are not actually part of the extended type’s definition. They are just static methods that syntactically appear to be part of the type. This means they don’t participate in polymorphism in the same way that true instance methods do.
- No override capability: You cannot use extension methods to override existing methods of a type. If there’s a method with the same signature as an extension method within the type, the type’s method will take precedence.
When to use extension methods vs inheritance or composition
- Use extension methods when…:
  - Adding methods to sealed classes: If you need to add functionality to a class that is sealed (cannot be inherited), extension methods provide a way to “extend” these types.
  - Enhancing types you don’t own: For types that are part of a library or framework where you don’t have the source code, extension methods allow you to add functionalities without modifying the original type.
  - Creating fluent interfaces: Extension methods are useful for building fluent interfaces, where method chaining can lead to more readable code, such as in LINQ.
- Prefer inheritance or composition when…:
  - Designing a family of types: If you’re creating a family of related types, inheritance provides a better structure, allowing shared functionality and polymorphic behaviors.
  - Needing access to internal state: If you need access to the private or protected members of a class, inheritance is a more suitable approach.
  - Building complex functionalities: For more complex functionalities that require maintaining state or extensive interaction with various aspects of a class, composition or inheritance offers a more integrated solution.
- Considerations for making a decision:
  - Purpose and scope: Evaluate whether the functionality is a core aspect of the type (favoring inheritance/composition) or a supplementary utility (suitable for extension methods).
  - Reusability and maintenance: Consider how the addition of functionalities will impact the maintainability and reusability of your classes. Extension methods can enhance reusability without affecting existing class hierarchies.
  - Coupling and cohesion: Assess the level of coupling your feature requires with the existing type. Lower coupling and higher cohesion favor extension methods.
Partial Things
Partial classes and partial methods are features in some programming languages that allow a class or method to be split across multiple files or sections within a single file. This splitting enables developers to organize code more effectively, particularly in large projects. A partial class is a class definition that is broken into multiple parts, typically across different files. Each part can contain a subset of the class’s members (fields, properties, methods), and when the code is compiled, the parts are combined into a single class definition. Similarly, partial methods are method declarations within a partial class that allow for optional implementation. If a partial method is not implemented, the compiler removes the declaration, preventing any runtime overhead.
Brief history and evolution of partial constructs in programming languages
The concept of partial constructs, particularly partial classes, emerged to address the growing complexity of software development, especially in environments where code generation tools play a significant role. Languages like C# introduced partial classes and methods as part of their feature set to allow better code organization, especially in situations where a single class would otherwise become too large or unwieldy. This was particularly valuable in the context of auto-generated code, where developers needed to extend or modify generated classes without directly altering the generated code. Shortly after adding initial support for OOP, the DBL language added support for partial classes and partial methods, providing developers with powerful tools to manage large codebases and promote clean, modular design.
Partial methods
Partial methods allow developers to declare methods within a partial class without being required to implement them immediately. These methods are defined in one part of a partial class and can be optionally implemented in another part. If a partial method is not implemented, the compiler automatically removes the declaration and any calls to it, ensuring there is no runtime overhead. This feature is highly beneficial in scenarios where generated code needs to include method declarations that allow for custom extensions or hooks. Developers can provide additional logic or behaviors by implementing the partial method in a separate file, keeping the generated and custom code cleanly separated. Partial methods are often used in code generation scenarios where the framework or tool provides default behaviors that developers can optionally override or extend, making them a powerful tool for maintaining flexibility and extensibility in large, complex codebases.
Use cases in code generation
Partial constructs are particularly useful when code is generated automatically—such as in Harmony Core or SQL replication. Defining classes or methods as partial allows developers to extend or customize the generated code without modifying the generated files. This separation ensures that custom code is preserved even when the generated code is updated. Maintaining the ability to keep generated code up to date is crucial to preventing bitrot.
Syntax
In both methods and classes, the only syntax requirement is the keyword partial.
import System.Text

namespace ExampleNamespace
    partial class MyPartialClass
        public myField, int

        ;; Declaration only; an implementation may be supplied in another part
        private partial method MaybeDoSomething, void
            param1, @StringBuilder
        endmethod

        public method GeneratedMethod, @String
        proc
            ;;do something interesting
            data myStringBuilder, @StringBuilder, new StringBuilder()
            myStringBuilder.Append("The first part")
            MaybeDoSomething(myStringBuilder)
            myStringBuilder.Append(" The last part")
            mreturn myStringBuilder.ToString()
        endmethod
    endclass
endnamespace
If this were the only part of MyPartialClass, the compiler would accept it and effectively skip over the undefined call to MaybeDoSomething. Important elements here are the requirements that a partial method be private and return void. This is necessary because when there is no corresponding implementation, the compiler silently removes the call; if the method had a return value, that silent removal would not be possible. Let's take a look at part 2 of a partial class definition.
import System.Text

namespace ExampleNamespace
    partial class MyPartialClass
        public myOtherField, int

        ;; Implementation of the partial method declared in the other part
        private method MaybeDoSomething, void
            param1, @StringBuilder
        proc
            param1.Append("Hello from a partial method")
        endmethod
    endclass
endnamespace
Avoiding anti-patterns
While partial classes and methods offer significant advantages in managing large codebases and improving code modularity, they can also introduce challenges if misused. One of the most common misuses of partial classes is scattering related code across multiple files without a clear organizational strategy. This can lead to difficulty in understanding the full behavior of a class because its logic is fragmented. Instead of enhancing clarity, partial classes can contribute to code that is harder to navigate and maintain. To avoid this situation, developers should use partial classes with a clear intent and a consistent structure. Each part of a partial class should encapsulate logically grouped functionality, such as separating auto-generated code from custom logic, rather than splitting code arbitrarily.
Similarly, partial methods, while useful for providing extension points, can lead to confusion if overused or poorly documented. For example, declaring too many partial methods without clear documentation on their intended use can make it difficult for other developers to understand which methods are critical and which are optional. This misuse can lead to a fragmented implementation where the flow of control is obscured, making the codebase harder to maintain and debug. Developers should use partial methods sparingly and ensure their purpose is well-documented, indicating when and why a method might be implemented or left unimplemented.
Common mistakes and how to avoid them
A frequent pitfall is relying too heavily on partial methods as a means of customization, especially in scenarios where other design patterns might be more appropriate. For example, if a partial method is used to override behavior that could be better handled through polymorphism or design patterns like the strategy pattern, it can lead to code that is less flexible and harder to extend in the future. Developers should instead carefully consider whether a partial method is the best tool for the job or if other, more robust patterns would offer better long-term maintainability.
A lack of documentation is a significant issue when working with partial constructs. Because partial classes and methods inherently spread code across multiple files or sections, thorough documentation is critical to ensure that future maintainers or collaborators understand the design and intent behind the code structure. Clear comments and documentation can help mitigate the risk of misuse and ensure that partial constructs remain a tool for clarity and organization rather than a source of confusion. By adhering to best practices in documentation, developers can ensure that partial classes and methods are used effectively, enhancing rather than hindering the quality and maintainability of the codebase.
Functional Style
Functional programming (FP) is a programming paradigm centered around treating computation as the evaluation of mathematical functions and avoiding changes to state or mutable data. At its core, FP relies on a few fundamental principles. One of these is the concept of pure functions, which are functions that, given the same input, will always produce the same output without causing any side effects. This predictability makes pure functions easier to understand, test, and maintain.
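To make this concrete, here is a minimal sketch contrasting a pure method with one that relies on mutable shared state. (The PurityExample class, its AddPure and AddToTotal methods, and the runningTotal field are invented for illustration and are not part of any existing library.)

namespace ExampleNamespace
    public class PurityExample
        private static runningTotal, int

        ;; Pure: the result depends only on the arguments, and nothing
        ;; outside the method is read or modified
        public static method AddPure, int
            a, int
            b, int
        proc
            mreturn a + b
        endmethod

        ;; Impure: the result depends on, and changes, shared state, so two
        ;; identical calls can return different values
        public static method AddToTotal, int
            amount, int
        proc
            runningTotal = runningTotal + amount
            mreturn runningTotal
        endmethod
    endclass
endnamespace

Because AddPure has no hidden inputs or outputs, it can be verified with a single assertion and called safely from concurrent code, while AddToTotal needs its shared field set up before each test and coordinated access at runtime.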
Another key principle in FP is immutability, meaning that data, once created, cannot be altered. This approach to data ensures that values remain consistent throughout the program’s execution, preventing unintended side effects and making it easier to reason about code, particularly in concurrent or multi-threaded environments. FP also embraces the idea of first-class functions, treating functions as values that can be passed as arguments, returned from other functions, or assigned to variables. This allows for the creation of higher-order functions—functions that operate on other functions—enabling powerful abstractions and promoting code reuse.
One of the significant benefits of FP is its declarative nature. Unlike imperative programming, which focuses on the specific steps needed to achieve a result, declarative programming describes what the program should accomplish without explicitly defining how to do it. This shift in focus leads to more readable and concise code, reducing the cognitive load on developers and making complex logic easier to manage.
When comparing FP with imperative programming, the differences become clear. Imperative programming is characterized by a sequence of commands that change the program’s state. This approach is intuitive for tasks that involve direct manipulation of data, often seen in object-oriented programming. In contrast, FP emphasizes the application of functions without altering the state, leading to code that is more modular and easier to test. While imperative programming excels in scenarios where step-by-step procedures are necessary, FP provides powerful tools for managing complexity, particularly in applications that require high levels of abstraction, modularity, and concurrency.
FP in DBL
With its robust support for functional programming constructs such as delegates, lambdas, and higher-order functions, DBL provides a flexible environment for developers.
Delegates in DBL allow developers to create and use references to methods, enabling functions to be passed as arguments or returned as values. This facilitates a higher level of abstraction and code reuse, which are hallmarks of functional programming. Lambdas, or anonymous functions, provide a concise way to write flexible code and are particularly useful for event handling and manipulating collections. Immutable data structures, a critical aspect of functional programming that ensures data consistency and helps programmers avoid unintended side effects, are available through the broader .NET ecosystem. Higher-order functions in DBL enable the creation of abstract and reusable code by accepting other functions as parameters or returning them as results.
As systems grow more complex, the need for maintainable, scalable, and reliable code becomes paramount. Functional programming addresses many of the challenges faced in contemporary software engineering. For example, FP’s emphasis on immutability and pure functions makes it easier to write concurrent programs, reducing the risk of common issues such as race conditions and deadlocks. Additionally, FP encourages breaking down problems into small, reusable functions, leading to more modular and maintainable codebases. Pure functions, being independent of external state, are inherently easier to test, aligning well with practices such as test-driven development (TDD). The declarative style of FP further enhances code clarity, reducing bugs and making maintenance easier. By integrating functional programming principles, DBL equips developers with the tools needed to write efficient, maintainable, and reliable software, meeting the demands of modern development environments.
Delegates in DBL
In DBL, delegates serve a role similar to function pointers in C or delegates in C#. A delegate is a type that represents references to methods with a specific signature, allowing methods to be passed as arguments to other methods, assigned to variables, or returned from functions. This capability enables methods to be treated as first-class citizens within the language, fostering a higher level of abstraction and code reuse.
Delegates in DBL are implemented as a special type of object that holds a reference to a method. The delegate’s signature defines the method’s parameter types and return type, ensuring that the delegate can only be associated with methods that match this signature. For example, a delegate in DBL might be used to reference a method that prints a message to the console. This delegate can then be invoked with specific arguments, allowing for flexible and dynamic method execution.
One of the key advantages of delegates is their ability to encapsulate method references, making them particularly useful in scenarios where methods need to be passed as parameters. This is commonly seen in event handling, where delegates can encapsulate the methods that should be called in response to specific events. Delegates also play a crucial role in implementing callback functions, enabling asynchronous operations and custom workflows. By using delegates, developers can create more flexible and reusable code, encapsulating business logic that might vary depending on runtime conditions. For instance, different pricing strategies or discount calculations could be encapsulated within delegates, allowing for easy swapping of logic without altering the underlying system. Additionally, delegates enable the composition of functions, where multiple operations can be chained together, improving code readability and reuse.
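As a rough illustration of these ideas, the sketch below declares a delegate type, a method that matches its signature, and a higher-order method that accepts the delegate as a parameter. The PriceRule delegate, the Pricing class, and its methods are invented for this example, and the exact syntax for binding a method to a delegate variable and invoking it can vary, so treat this as a sketch rather than a definitive reference.

namespace ExampleNamespace

    ;; Delegate type: any method that takes an int and returns an int
    ;; can be referenced through it
    public delegate PriceRule, int
        price, int
    enddelegate

    public class Pricing

        ;; A concrete pricing strategy matching the PriceRule signature
        public static method TenPercentOff, int
            price, int
        proc
            mreturn price - (price / 10)
        endmethod

        ;; Higher-order method: the caller decides which rule is applied
        public static method ApplyRule, int
            price, int
            rule, @PriceRule
        proc
            mreturn rule(price)
        endmethod

        public static method Demo, int
        proc
            ;; Bind a method to a delegate variable and pass it along
            ;; (binding by method name is assumed here; check the compiler
            ;; documentation for the exact delegate-creation syntax)
            data rule, @PriceRule, TenPercentOff
            mreturn ApplyRule(100, rule)
        endmethod

    endclass

endnamespace

Because ApplyRule depends only on the PriceRule signature, alternative strategies such as seasonal discounts or customer-specific pricing can be supplied without modifying it, which is exactly the kind of swappable business logic described above.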