|
Managing Hit List databases, database growth, and some
speed tips.
By default Hit List creates Access97 databases from your
logfile data, with a filename extension of .mwd. For those
using Hit List Pro or Hit List Commerce the following tips
will help you maintain, protect and enhance performance of
your Access97 databases.
Enterprise and Live users have the option of using SQL Server or Oracle databases instead of Access97, tips
included below.
General Speed and Maintenance Tips, applicable to ALL Hit
List databases (regardless of format).
A) Control what logfile data you load
into the database.
Make sure you are excluding from the database all "irrelevant
data", namely GIFs, JPEGs, JPGs, internal IPs, User-Agent
strings (browser types you don't care about), HTTP Status
codes you don't care about, etc.under Options/Updates. Keeping
such data out of the database altogether will make reports
run faster, slow down database growth rates, and record the
data you want for the content you want to track the most efficiently.
For more information on using the Options/Updates tab including
the QuickList Extreme optimizations mentioned in the next
paragraph, check our Options or our Filtering tech support documents.
Enterprise and Live users also can choose from several QuickList
Extreme optimizations under Options/updates. See Help after
hitting the "quicklist extreme" button in the upper right
for more details on these features and how they can help you
maintain very large databases.
B) Controling Database growth.
1) Update the database daily, optimize frequently. Make
sure you are updating/optimizing your database on a regular
and frequent basis. A "typical" schedule (depending on your
volume, of course) is usually nightly database updates with
optimizing the database once or twice weekly. Optimizing the
database is done in the Database Manager/Tools tab.
If your webserver doesn't rollover its logs but once a month
or once a week, investigate the possibility of switching to
daily rollovers. Database updates will take less time, you
will be better equipped to back up the database if something
goes wrong, and you will be able to put in the most data possible.
2) Use Summary-only data. If you aren't doing high-level
filtering against your database, and have no need for reports
such as Path Analysis or Marketing Report (which require detail
data - you can see this by highlighting the report's icon
and checking the gray Status bar at the bottom of the main
window) you can use Summary-Only Data (see the Options tech document, Updates tab discussion for
more details). This will drastically reduce the size of the
database as compared to the logfile, grow the database MUCH
slower, and still allow further Optimization as well.
For an extended discussion of the differences between Summary
and Detail data, read our Summary vs. Detail tech document.
In some internal tests, we've loaded 800 mb of a customer's
logs as Summary-Only data, excluding GIFs, JPEGs, etc. with
a resultant 100 mb database. Optimizing the database afterward
reduced the same databsae to 60 mb in size. So you can see
that Summary-Only data can make a drastic difference.
Many Professional and Commerce customers use a "Summary-only"
database for long-term tracking while using a "Summary and
Detail" database for short-term Marketing and Path Analysis
reporting. Check out the Multiple Databases tech document
for instructions on setting this up.
C) Back your database and logs regularly -"MOST IMPORTANT"
As with any other mission critical data, regular backup of
essential files is VERY IMPORTANT.
Logs. You should back up your webserver logs (so
if you need them for additional database work you have them
ready) onto a CD-ROM, Jaz drive or similar on a frequent
basis.
Databases. You should back up your Hit List logfile
database DAILY. PROTECT YOUR DATA! We at Marketwave Tech
Support talk to customers on a frequent basis that don't
back up their databases, and then have to rebuild them in
the event of a hard drive failure, computer crash, etc.
If you want to preserve the data and don't want to rebuild
your database from what might be many megabytes of logs,
back it up!
If using Professional or Commerce, this file is usually
called Hitlist.mwd. If using Enterprise or Live, this
file would likely be the same if using Access97 databases.
If using SQL Server or Oracle databases with Enterprise
or Live, use whatever scheme you have within SQL Server
or Oracle to backup and protect the databases accordingly.
If none, consult your SQL Server or Oracle dba, who may
already be backing things up.
Reports database. You should back up your reports
database (filename = hlrpt40x.mdb, the "x" representing
which version of Hit List you have, p for Pro, c for Commerce,
e for Enterprise and l for Live). on a regular basis to
preserve custom reports you make, more often if you build
new reports fairly often or have many reports already built.
D) Reporting speed tips
Remove elements you don't care about.
With Hit List reporting, the easiest way to speed up reports
is to remove/disable elements you don't care about from
the reports. Thus if you're using Complete Analysis (or
a copy of same) and you don't care about "most popular
browsers", simply remove this element after highlighting
the report, clicking Design and scrolling down to this
element on the Outline tab. Once you delete this element
(or disable it from within its Properties) there is one
less calculation for Hit List to perform when running
this report, and it will complete quicker.
Scale back certain elements.
Many Hit List elements such as "most popular pages" allow
you to set the number of records returned to "Top 50 Items"
or Top 100 percent" - if this returns more records than
you care about (ie you only want the Top 10 Items or Top
10 "most popular pages") then scale this element back
by adjusting the "limit results to" setting in the element
at issue (highlight the report you're using, click "Design",
then go to the element's Properties/Definition tab and
make the change).
Make your Filters more efficient.
When you impose Filters on a Hit List report, either at
the Report or Element level (see our Filtering
tech document for more details) you can potentially slow
this report down, especially if you put in multiple
filters at both the report and element levels. So, after
deciding what results you need or want to see from a given
report, take the time to plan your filtering to
get the results you want more efficiently. Remember that
the "URLs within the Visit" filter can take a long time
to run in some cases.
Reduce the Time Period for the report when possible.
Remember that the more time a report covers, the more
data it must report against and thus you can speed up
reports by setting "time period" to "yesterday" or "last
week", etc. instead of "since inception" in many cases.
Certainly, there will be times when you need a complete
year-to-date report or something similar, but in many
cases you do not, and thus can reduce the Time Period
against which the report runs, and speed up reporting
as a result.
Managing Access97 databases specifically (always applies
to Professional and Commerce users, Enterprise and Live as
necessary).
1) If the database gets over 1 GB in size, it's
not clear what Access97 (or Hit List) will do. Access97
databases have an effective size limitation of 1 GB. Sometimes
everything will work, but in MOST situations it will "konk
out" and despite attempts at repair in Access97 or Hit List,
etc. you won't be able to use the database and will have
to rebuild.
So keep a close watch on the database growth rate, and
when the database has grown to 900-950 mb, begin a new one
or "cycle" as discussed below. This size limit is why the
importance of a backup CANNOT BE OVERSTATED. In the simplest
scheme, you could simply zip up the existing logfile database
(usually hitlist.mwd) on a daily basis and keep replacing
the zip daily.
2) When you get up around 850-900 mb in database
size, it's usually a good idea to "cycle" the database in
the Database Manager/Tools. This retains your IP lookup
data, the information about which logs you've already loaded,
and creates a new database to import more data into while
renaming the old. It also allows you the flexibility of
retaining summary data (provided you're using Summary and
Detail) to allow certain types of longer-term reporting
while still doing short-term reports. See Help in the Database
Manager/Tools tab for more information.
SQL Server databases
Enterprise and Live users have the option of using SQL Server
as a back-end database instead of Access97. For these users,
the following may help them keep their databases running smoothly.
1) Set up SQL Server "correctly".
In many cases, users don't consult Hit List Help under the
"Using Oracle and SQL Server" Help topic or our FAQ document
on setting up their SQL Server and then
have problems (common SQL Server/Oracle problems are outlined
in the Database in Use by another user document).
So if you're setting up SQL Server (7.0 or 6.5) for use with
Hit List for the first time or are having problems, read this
Help topic in addition to the document noted above.
2) Configuration of the machine.
Here's a few configuration options you may adjust as your
requirements vary, your website grows or you upgrade your
SQL Server/Hit List machine:
Memory - in both SQL Server 6.5 and 7.0, you can adjust
how much RAM SQL Server gets to exclusively use. The usual
recommendation in most setups is slightly less than half your
RAM should be allocated to SQL Server (i.e. 225 mb RAM to
SServer on a 512 mb RAM machine). Updating will go quicker
if you give more memory to SQL Server, but then reporting
might be slower. And the converse is true, that if you take
away RAM from SQL Server, thus leaving it for Hit List and
NT, then reporting will go quicker but updating slower. So,
a bit of experimentation might be in order if you have specific
needs - for example, if you have plenty of time available
for updating (overnight) but you don't for reporting, you
may want to scale back the RAM SQL Server gets to allow reporting
to run quicker.
NOTE: There probably will be a point where either Hit List
or SQL Server will end up with too little RAM to be effective
(i.e. don't give SQL Server 12 mb RAM and Hit List/NT 500
mb RAM) so these adjustments should not be too extreme.
Hard Drives and RAM - SQL Server (6.5 or 7.0, especially
7.0) will ultimately run fastest on either very fast SCSI
drives or a RAID configuration, and if you are processing
several hundred mb of logs nightly, you should consider such
a setup. In a similar vein, the more RAM, the better. There
have been a few cases of people trying to run several hundred
mb of logs into a 64 mb RAM, IDE SQL Server setup - this will
not suffice!
3) Evaluate the "general" database tips at the top of
this document and USE them.
Many customers think that no adjustment needs to be done
or no regular "maintenance" of the database files is important
- THIS IS NOT TRUE. These maintenance items were added to
the product to enhance use and provide flexibility whenever
possible - they are designed to help you!
Oracle Server databases
As of this writing, the only recommendations beyond the above
suggestions (that apply to either all databases or SQL Server
databases) would be to make sure that your network connectivity
is stable and constant (can sometimes be a problem for Oracle
Unix users) and that your Oracle dba has allocated sufficient
table space for the log(s) you want to load into the Oracle
database. Make sure that person or persons know that you want
to process a lot of data and thus usually will require a lot
of space in the Oracle setup as a result.
To examine possible connectivity issues with Oracle and Hit
List, see our Database in Use by another user document.
|