Important Information: DBfuncs has been merged with DBSheets and moved to DBAddin on sourceforge.net !!!
Introduction
DBFuncs is an Excel-addin for database querying by userdefined
functions. This is opposed to the Excel integrated MS-Query, which is
integrated statically into the worksheet and has some limitations in
terms of querying possibilities and flexibility of constructing
parameterized queries (MS-Query allows parameterized queries only in
simple queries that can be displayed graphically).
Furthermore it includes the possibility for easy filling of so
called "data bound controls" (ComboBoxes and Listboxes) with data from
queries.
DBFuncs has been tested extensively (actually it's in production)
only with Excel XP and MS-SQLserver, other databases (MS Access, MySQL, Oracle,
PostgreSQL, DB2 and Sybase SQLserver) have just been tested with the
associated Testworkbook "DBFuncsTest.xls".
To
use that Testworkbook you'll need the pubs database. I have scripts available
for Oracle, Sybase, DB2, PostgreSQL and MySql here (the MS-SQLserver version can be downloaded here).
You
can also download the latest version of DBFuncs from sourceforge, it is distributed under
the GNU Public
License.
There are two principal ways to query data with DBFuncs:
A (fast) list-oriented way using DBListFetch.
Here the values are entered into a rectangular list starting from the TargetRange cell (similar to MS-Query, actually the QueryTables Object is used to fill the data into the Worksheet).
A record-oriented way using DBRowFetch
Here the values are entered into several ranges given in the Parameter list "TargetArray". Each of these ranges is filled in order of appearance with the results of the query.
These user-defined functions insert the queried data outside their
calling cell context, which means that the target ranges can be put
anywhere in the workbook (even outside of the workbook).
Additionally, some helper functions are available:
"chainCells", which concatenates the values in the
given range together by using "," as separator, thus making the
creation of the select field clause easier.
"concatCells" simply concatenating cells (making the "&" operator obsolete)
"DBinClause", building a SQL in clause from an open
ended parameter list given in the argument.
"DBString", building a quoted string from an open
ended parameter list given in the argument. This can also be used to
easily build wildcards into the String.
"DBDate", building a quoted Date string (format YYYYMMDD) from the date value given in the argument.
"MarktwertHolen", fetching Date-oriented market data into a single cell
Finally, there is a supporting tool available for building queries and placing them into DBListFetch or DBRowFetch functions (similar to MS-Query).
The select statement for querying the values is given as a text
string in parameter "Query". This text string can be a dynamic formula,
i.e. parameters are easily given by concatenating the query together
from other cells, e.g. "select * from TestTable where TestName = "&A1
The connection string is either given in the formula, or for
standard configuration can be left out and is then set globally inside
the DBfuncs.xla addin (module DBfunctions, top):
Public Const ConstConnString = "provider=SQLOLEDB,Server=..." Public Const CnnTimeout = 10
The returned list values are written into the Range denoted by "TargetRange". This can be
just any range, resulting data being copied beginning with the left-uppermost cell
a self-defined named range (of any size) as TargetRange, which
resizes the named range to the output size. This named range can be
defined (and set as function parameter) either before or after results
have been queried.
There is an additional FormulaRange that can be
specified to fill “associated” formulas (can be put anywhere (even in other
workbooks), though it only allowed outside of the data area). This FormulaRange
can be
either
a one dimensional row-like range or a
a
self-defined named range (of any size extent, columns have to include
all calculated/filled down cells) , which resizes the named range to the
output size. This named range can be defined (and set as function parameter)
either before or after results have been queried.
with formulas usually referring to cell-values fetched
within the data area. All Formulas contained in this area are filled
down to the bottom row of the TargetRange. In case the FormulaRange starts lower than the topmost row of TargetRange, then any formulas above are left untouched (e.g. enabling possibly different calculations from the rest of the data). If the FormulaRange starts above the TargetRange, then an error is given and no formulas are being refreshed down. If a FormulaRange is assigned within the data area, an error is given as well.
In case TargetRange is a named range and the FormulaRange is adjacent, the TargetRange is automatically extended to cover the FormulaRange as well. This is especially useful when using the compound TargetRange as a lookup reference (Vlookup).
The next parameter "ExtendDataArea" defines how DBListFetch should behave when the queried data extends or shortens:
0: DBListFetch just overwrites any existing data below the current TargetRange.
1: inserts cells of just the width of the TargetRange below
the current TargetRange, thus preserving any existing data. However any
data right to the target range is not shifted down along the inserted
data. Beware in combination with a FormulaRange that the cells below the FormulaRange are not shifted along in the current version !!
2: inserts whole rows below the current TargetRange,
thus preserving any existing data. Data right to the target range is
now shifted down along the inserted data. This option is working safely
for cells below the FormulaRange.
The parameter headerInfo defines whether Field Headers should be displayed (TRUE) in the returned list or not (FALSE = Default).
The parameter AutoFit defines whether Rows and Columns should be autofitted to the data content (TRUE) or not (FALSE =
Default). There is an issue with multiple autofitted target ranges
below each other, here the autofitting is not predictable (due to the
unpredictable nature of the calculation order), resulting in not fitted
columns sometimes.
The parameter AutoFormat defines whether the first data row's format information should be autofilled down to be reflected in all rows (TRUE) or not (FALSE = Default).
The parameter ShowRowNums defines whether Row numbers should be displayed in the first column (TRUE) or not (FALSE = Default).
DBRowFetch
DBRowFetch (Query, ConnectionString (optional), headerInfo(optional/ contained in paramArray), TargetRange (paramArray))
For the query and the connection string the same applies as mentioned for "DBListFetch".
The
value targets are given in an open ended parameter array after the
query, the connection string and an optional headerInfo parameter.
These parameter arguments contain ranges (either single cells or larger
ranges) that are filled sequentially in order of appearance with the
result of the query.
For example:
DBRowFetch("select job_desc, min_lvl, max_lvl, job_id from jobs " &_ "where job_id = 1",,A1,A8:A9,C8:D8)
would insert the first returned field (job_desc) of the given query
into A1, then min_lvl, max_lvl into A8 and A9 and finally job_id into
C8.
The
optional headerInfo parameter (after the query and the connection
string) defines, whether field headers should be filled into the target
areas before data is being filled.
For example:
DBRowFetch("select job_desc, min_lvl, max_lvl, job_id from jobs",_ ,TRUE,B8:E8, B9:E20)
would insert the the headers (job_desc, min_lvl, max_lvl,job_id) of the given query into B8:E8, then the data into B9:E20, row by row.
The orientation of the filled rows is always determined by the first range within the TargetRange parameter array: if this range has more columns than rows, data is filled by rows, else data is filled by columns.
For example:
DBRowFetch("select job_desc, min_lvl, max_lvl, job_id from jobs",_ ,TRUE,A5:A8, B5:I8)
would fill the same data as above (including a header), however
column-wise. Typically this first range is used as a header range in
conjunction with the headerInfo parameter.
Beware that filling of data is much slower than with DBlistFetch, so use DBRowFetch only with smaller data-sets.
DBinClause
DBinClause(ParameterList)
Creates an in clause from cell values, strings are created with quotation marks, dates are created with DBDate (see there for details, formatting is 0).
DBinClause("ABC", 1, DateRange)
Would return ”(‘ABC’,1,’20070115’)” , if DateRange contained 15/01/2007 as a date value.
chainCells
chainCells(Range)
chainCells "chains" the values in the given range together by using
"," as separator. It's use is mainly to facilitate the creation of the
select field clause in the Query parameter, e.g.
DBRowFetch("select " & chainCells(E1:E4) & " from jobs where job_id = 1",_ ,A1,A8:A9,C8:D8)
Where cells E1:E4 contain job_desc, min_lvl, max_lvl, job_id respectively.
concatCells
concatCells(Range)
concatCells concatenates the values in the given range
together. It's use is mainly to facilitate the building of very long
and complex queries:
DBRowFetch(concatCells(E1:E4),,A1,A8:A9,C8:D8)
Where cells E1:E4 contain the constituents of the query respectively.
Both chainCells and concatCells work with matrix conditionals, i.e. matrix functions of the form: {=chainCells(IF(C2:C65535="Value";A2:A65535;""))} that only chain/concat values from column A if the respective cell in column C contains "Value".
DBString
chainCells(ParameterList of Ranges)
This builds a Database compliant string (quoted) from the open ended
parameter list given in the argument. This can also be used to easily
build wildcards into the String, like
DBString("_",E1,"%")
When E1 contains "test", this results in '_test%', thus matching in
a like clause the strings 'stestString', 'atestAnotherString', etc.
DBDate
DBDate(DateValue, formatting (optional))
This builds from the date/datetime/time value given in the argument based on parameter formatting either
(default formatting = 0) A simple datestring (format 'YYYYMMDD'), datetime values are converted to 'YYYYMMDDHH:MM:SS' and time values are converted to 'HH:MM:SS'.
(formatting = 1) An ANSI compliant Date string (format date 'YYYY-MM-DD'), datetime values are converted to timestamp 'YYYY-MM-DD HH:MM:SS' and time values are converted to time time 'HH:MM:SS'.
(formatting = 2) An ODBC compliant Date string (format {d 'YYYY-MM-DD'}), datetime values are converted to {ts 'YYYY-MM-DD HH:MM:SS'} and time values are converted to {t 'HH:MM:SS'}.
An Example is give below:
DBDate(E1)
When E1 contains the excel native date 18/04/2005, this results in : '20050418' (ANSI: date '2005-04-18', ODBC: {d '2005-04-18'}).
When E1 contains the excel native date/time value 10/01/2004 08:05, this results in: '20040110 08:05:00' (ANSI: timestamp '2004-01-10 08:05:00', ODBC: {ts '2004-01-10 08:05:00'})
When E1 contains the excel native time value 08:05:05, this results in '08:05:05' (ANSI: time '08:05:05', ODBC: {t '08:05:05'})
Of course you can also change the default setting for formatting by changing the function header in Module "DBfunctions" in DBfuncs.xla:
DBDate(datVal As Date, Optional formatting As Integer = 0) As String
Plookup
Plookup(inputRange, selectionRange, targetRange)
Does a pattern weighted lookup of values in
range inputValues in pattern lookup area selectionRange, returns the values contained in found row of
range targetRange (if entered as a matrix function).
In case Plookup is not entered as a matrix
function, Plookup returns the first column of the matching row of targetRange. If more than one row matches, always return
values from first matching row.
input: 1 2 3 4 x > matches 4th row -> returns 41 42 43 input: 1 2 3 4 5 > matches 5th row -> returns 51 52 53 input: x y 3 z 5 > matches 3rd row -> returns 31 32 33 input: x 2 3 z 5 > matches both 2nd and 3rd row -> returns 21 22 23 because row 2 is more precise
Supporting Tool Query Builder
There is a supporting tool for building queries (similar to MSQuery,
in fact MSQuery is used as a frontend to build the query) available in DBfuncSupport.xla.
The query builder is invoked by right clicking anywhere in an open
excel sheet and selecting "build DBfunc query". This starts MSquery
(see associated documentation on using msquery to define queries),
building the query either in the Query-Assistant or in MS-Query itself,
the option "Back to Excel" ("Zurück an Excel" in the german version)
lets you insert the query into the active worksheet as one of the
following "DB items":
a DBListFetch function
a DBRowFetch function
a Dropdown Database form
a Listbox Database form
Following dialog is used to achieve this:
First the DB item to be inserted has to be chosen by selecting in
the "DB function/ DB bound control" choice box. Then, depending on the
above choice, the target cell for the function formula, resp. the
target cells for the parameters of the function can be selected:
Function Target: The cell where the DB function (either DBRowFetch or DBListFetch) is being placed. Available only for DBRowFetch and DBListFetch.
Data Target: For DB functions, the cell(s) where retrieved database
data is going to be placed. For DB bound controls this corresponds to
the "LinkedCell" property, which is the target cell where the chosen control value is put. Available for all DB items.
Query Target: The cell where the query is placed in case it is bigger than 255 characters. Available only for DBRowFetch and DBListFetch.
If this is explicitly set then the query is always placed there,
regardless of it's size! Also the query is placed there if it contains
quotation marks (").
ConnDef Target: The cell where the connection definition is placed
in case "use custom database setting" is chosen (used to override the
standard connection definition in DBfuncs.xla, see DBListFetch function
description above) and the connection definition string is bigger than
255 characters. Available only for DBRowFetch and DBListFetch.
Range Calc: The range where formulas that should be filled in along with data are going to be placed, Available only for DBListFetch.
Other possible choices are:
additional Data choice: The way additional Data should be treated
in DBListFetch (see explanation there). Available only for DBListFetch.
"include Header Info?": should Field Headers be included in DBListFetch (see explanation there). Available for DBListFetch DBDropDown and DBListbox .
"automatic Column Fit?": should columns be autofitted for DBListFetch (see explanation there). Available only for DBListFetch.
"automatic Format fill?": should 1st row formats be autofilled down
for DBListFetch (see explanation there). Available only for
DBListFetch.
"show row numbers?": should row numbers be displayed in 1st column
for DBListFetch? (see explanation there). Available only for
DBListFetch.
custom database setting: if a different database from the global standard connection definition (see DBListFetch function)
is used, then this is activated. Available for all DB items. Also the
ODBC provider "MSDASQL.1" is inserted automatically in front of the
rest of the connection string.
For DB bound controls the Data target corresponds to the LinkedCell of
the control. Actually only data target, "include Header Info?" and "use
custom database setting?" are available, as the query and the (possible
custom) connection definition are always put to the right of the data
target (LinkedCell).
With DB bound controls, also a "header" label (gray background, bars
as separators) is added. This is also the reason why the font in the
listbox/dropdown is always a fixed width, to make calculation and
alignment of header/date widths possible:
After creating the DB bound control both a question whether all
currently existing DB controls within the workbook should be filled now
and a hint to fill in the auto_open procedure ("Don't forget to add
'Application.Run ...' commands (in clipboard) to your workbook_open
procedure (or create one)") are displayed. A minimum auto_open
procedure is added to the clipboard which can be pasted into the
workbook's ThisWorkbook module (this uses code originally written by
Terry Kreft).
Global Connection Definition and Query Builder
There are two possibilities of connection strings: ODBC or OLEDB.
ODBC hast the advantage to seamlessly work with MS-Query, native OLEDB
is said to be faster and more reliable (there is also a generic OLEDB
over ODBC by Microsoft, which emulates OLEDB if you have just a native
ODBC driver available).
Now, if using ODBC connection strings (those
containing "Driver="), there is a straightforward way to redefine
queries directly from the cell containing the DB function: just right
click on the function cell and select "build DBfunc query". Then
MS-query will allow you to redefine the query which you can use to
overwrite the function's query.
If using OLEDB connection strings, MS-query will
try to connect using a system DSN named like the database as identified
after the DBidentifierCCS given in the global constant connection
definition:
Public Const ConstConnString = "Provider=OraOLEDB.Oracle;.. ..;User ID=pubs" Public Const DBidentifierCCS = "User ID=" Public Const DBidentifierODBC = "UID="
The DBidentifierCCS is used to identify the database within the
global constant connection definition, The DBidentifierODBC is used to
identify the database within the connection definition returned by
MS-Query (to compare and possibly allow to insert a custom connection
definition within the DB function/control). Usually these identifiers
are called "Database=" (all SQLservers, MySQL), "location="
(PostgreSQL), "User ID/UID" (oracle), "Data source=" (DB2)
DB bound controls
Creating DB bound controls
DB bound controls can be created using the supporting tool described
above, but can also be created manually following conventions:
The name property of the control must start with "DB_", required for the filling procedure (see below)
The LinkedCell property must be filled, this is needed to locate the associated query (right to the linked cell)
Correspondingly, the cell containing the query must be exactly one cell right to the LinkedCell
The optional connection definition string must be exactly two cells right to the LinkedCell
The ColumnCount property should reflect the expected column count from the query.
Filling DB bound controls
The DB bound controls are filled by calling the Macro "initDBforms" in the DBfuncSupport.xla addin the following way (preferably in an Workbook_open event proc):
Private Sub Workbook_Open() Application.Run "DBFuncs.xla!initDBforms", ThisWorkbook.Name End Sub
This walks through the supplied workbooks sheets, refreshing all
contained controls having names starting with "DB_" with the queries
contained in the cell directly right to the
LinkedCell. Additionally it checks whether the next cell right to the
query cell contains something else than the empty string and in that
case takes the content as a custom connection definition.
The DB bound control retains the selection during a refresh (e.g.
saving/reopening the workbook). Also the height and width of the DB
bound control are maintained.
Installation
Installation is simply done by copying the two Excel addins
DBFuncs.xla and DBFuncSupport.xla into an XLStart folder of your choice
(there are basically three of them:
one in C:\Programs\Microsoft Office\Office<X>\XLStart,
one in C:\Documents and Settings\<username>\Application Data\Microsoft\Excel\XLStart
and the last in the Alternative start folder you specified in Options/General).
Then you'd want to adapt the standard connection string that is
globally applied if no function-specific connection string is given.
This is done in DBFuncs.xla, module DBfunctions, on top:
Public Const ConstConnString = "provider=SQLOLEDB,Server=…. " Public Const CnnTimeout = 10
When starting the Testworkbook, after waiting for the – probable –
connection error, you have to change the connection string(s) to suit
your needs (see below for explanations).
The connection information in "DBFuncsTest.xls" are stored
to the right if the black line, the actual connection is the selected
by choosing the appropriate shortname (dropdown) in the yellow input
field. After the connection has been changed don't forget to refresh
the queries/DBforms by right clicking and selecting "refresh data".
Points of Interest
The basic principle behind returning results into an area external
to the Database query functions, is the utilisation of the calculation
event (as mentioned in and inspired by the excelmvf project, see http://www.codeproject.com/macro/excelmvf.asp for further details), as Excel won't allow ANY side-effects inside a UDF.
There are lot's of information to be carried between the function
call and the event (and back for status information). This is achieved
by utilising a so-called "calcContainer" and a "statusMsgContainer", basically being VBA classes abused as a simple structure that are stored into global collections called "allCalcContainers" and "allStatusContainers".
The references of the correct calcContainers and statusMsgContainers
are the Workbook-name, the table name and the cell address of the
calling functions which is quite a unique description of a function
call (this description is called the "callID" in the code).
Below diagram should clarify the process:
The real trick is to find out when resp. where to get rid of the
calc containers, considering Excel's intricate way of invoking
functions and the calc event handler (the above diagram is simplifying
matters a bit as the chain of invocation is by no way linear in the
calculations in the dependency tree).
Excel sometimes does additional calculations to take shortcuts and
this makes the order of invocation basically unpredictable, so you have
to take great care to just work on every function once and then remove
the calcContainer (if you're interested in the actual invocations being done, set #Const DebugMe = True in both class module "calcClass" and module "DBfunctions" to see what's going on in the direct window).
After every calculation event the working calcContainers are removed, if there are no more calcContainers left, then allCalcContainers is
reset to "Nothing", being ready for changes in input data or function
layout. Good resources for more details on the calculation
order/backgrounds is Decision Model's Excel Pages, especially
Calculation Secrets (http://www.decisionmodels.com/calcsecretsc.htm).
History
2006: First versions using CopyFromRecordSet for DBListFetch, as I found this function to be highly unreliable, changed to use WorkSheet.QueryTables from Version 2.0 instead, which copies all data correctly.