What is our primary use case?
erwin Data Modeler does conceptual, logical, and physical database or data structure capture and design, and creates a library of such things.
We use erwin Data Modeler to do all of the levels of analysis that a data architect does. We do conceptual data modeling, which is very high-level and doesn't have columns and tables. It's more concepts that the business described to us in words. We can then use the graphic interface to create boxes that contain descriptions of things and connect things together. It helps us to do a scope statement at the beginning of a project to corral what the area is that the data is going to be using.
Then we do logical data models, which are completely platform-independent. They're only about datasets, the owned attribution, and different key analyses to determine what primary keys we want.
And then we do database designs, which relates to the physical data models.
We also do reverse-engineering where we are capturing the catalogs of existing systems, or purchased software, or even external vendor datasets. They send us data sets and we can reverse-engineer what they send us, especially the backup snapshots where a vendor in the cloud will send data as a backup restore. So to help the documentation for the reporting team, we do reverse-engineering so that they know what the table and column structure look like, along with sizing, nullability, and keys and constraints.
erwin is on-prem. We have the Workgroup Edition, which means that we don't just have client-side software. We have client-side software that stores the data models back into a database which is on an on-prem server.
How has it helped my organization?
When I got to my current company two years ago, it didn't have any collection of its data assets into reporting services. If someone wanted to know where a social security number was in all the databases, they had to download all of the structures and do all of the research. I came in and did a full production-environment, reverse-engineer library. Once I did that using erwin front end, I could help the CCPA team find the PII data by simply doing Workgroup Edition Data Mart reports that crossed all of the environments.
The cool thing about that is that the erwin models will bring in data from a dozen or two dozen different platforms. But once those models are in your Mart structures, you can do your search, looking for something like names of columns, across all of them. So you could be doing a search across Oracle, PostgreSQL and, because it's in your library, you can look at your assets holistically. For us, we went from zero to 500,000 columns of information. You can do that in Excel or in other ways, but this is a very simple way to do it. And you don't need to be highly-trained and skilled. You could actually bring in a college intern and set them loose with creating those libraries for you. Not needing highly-skilled people is one of the great things about erwin. It's very intuitive and it's not hard to use.
At my current company, we're not using it for much custom work, but in my past, the solution's ability to generate database code from a model for a wide array of data sources absolutely helped to cut development time. If you do your design on paper or in an erwin model before the developers start coding, and you review them to make sure that you've got everything in there, you do much less break-and-fix. If you can have an overview model, even for your Agile developers, and say, "This is where we're going," even if you don't deploy at all, it makes it much simpler. You don't have to drop your structures and recreate and reload your test data because you're pretty confident you've gotten your database design right, before people start coding.
erwin improves your standards because your naming standards and your design standards can all be reviewed much easier. You can make sure that misspellings, for instance, don't get all the way to production, or to the point where you have to live with them because people are already coding against them. You can do so much more QA analysis on your structure before it's deployed, if you're using a model.
What is most valuable?
You could probably use something like Visio to draw boxes and lines, especially for conceptual, very high-level things. But when you're getting down to the database level, where you're building a design and you're creating DDL out of it, or you're going in the other direction where you're reaching into system catalogs and bringing things back, that starts to really require specialization. Visio isn't going to reverse-engineer that for you. Those features in erwin are valuable.
In addition, erwin has versioning so you can keep versions, over time, of those models and you can compare any version to any version. If you're looking at a specific database and you want to see what changed over time, that's really useful. You can go back to a different version or connect that to your change-control processes so you can see what was released when.
With versioning, you can also compare between development environments and production environments. You can see what may not have actually changed or what changes are in the works. It also enables you to do the kind of troubleshooting where you're looking at: Why on this server does this copy of something seem to behave differently than on that server? erwin highlights that really quickly for you. You don't have to closely eyeball your comparison. erwin creates a report that comes back and says what is different. And you can focus on almost anything, from the privileges in the catalog to a data type or a name anomaly. Even for servers that are case-sensitive in their structure, it will tell you the difference between something in all-caps and something that's mixed-case. If you're getting to that level of detail when you're troubleshooting, erwin is great at doing that sort of thing.
In terms of the solution's visual data models for helping to overcome data source complexity, erwin shows you "what is," if you're talking about the physical layer. When it comes to being able to make things clearer and more understandable, it depends on what your structure is. If you've just reverse-engineered SAP, it's abbreviated German. You may need other tools to help you understand it. If you're doing forward work — if you're going from conceptual to logical to physical — erwin is fabulous at letting you change what you see in the graphic. You can change your data model from just looking at primary keys to looking at primary keys and foreign keys, to looking at just the definition of the table in boxes. It allows you to change that visualization depending on your audience. If you're working with the DBAs, you can add metadata and it expands the box showing the visual of the table structure, so you can concentrate on just data types, or you can do data types and nullability and foreign keys, and all different sorts of things. You can do the indexes on top of it as well. You could end up with a table graphic that's the width of your screen if you've added all the details in.
And if it's too hard to look at that way — if you're trying, for instance, to make sure that EmpID Is always a varchar 250 — it also has the ability to take that graphic and move it into what's called the Bulk Editor. That looks much more like an Excel spreadsheet, within a view in your erwin model. You can sort your Excel spreadsheet by column name and see all of the details next to it. That way, everywhere EmpID shows up in that model, it is now in more of a column-row view, and you can easily look at that to make sure that all the EmpIDs say varchar 250. If you see one that's wrong, you can actually change it in the Bulk Editor and it changes it in the graphic automatically, because an erwin model really isn't a graphic, it's much more like a little Access database. So when you change it on one view, it fixes it in the other.
In addition, anybody using erwin to do forward engineering will find the solution's ability to compare and synchronize data sources with data models, in terms of the speed of keeping them in sync, to be almost instantaneous. You can connect an erwin data model to a database and deploy your changes, or you can deploy just delta changes. Or you can deploy one little piece because you've identified one little piece of your model. But most of comparing and synchronizing data sources with data models comes down to people and process. The tool will absolutely help you get there, but it's not going to take on all of the requirements of putting standards and processes in place. If you haven't tied your erwin Data Modeler to your change-control, it can't help you. So it's not a dynamic connection to your servers, it's just a tool that you can use with your environments.
Also, while I'm not configuring erwin, I do have templates that erwin lets me set up to configure models: different templates do colors and domains and prebuilt macros for definitions, based on different things. You don't have to configure erwin. You just have to tell it what sort of a platform you're either going to or coming from. You can also set up some draw templates and customize the colorization of different things. If you want all your primary keys to be red, you can configure that, and set that up as a template.
Finally, the solution's code generation ensures accurate engineering of data sources. With reverse-engineering, I have found it to be completely accurate. I've never found a time when it didn't get the source information correctly into the model. If you're doing a data warehousing project, where you're going from source to target, erwin can produce an extremely comfortable and dependable and trusted graphic of where you're coming from, while you design where you're going to. You know what the data types are, what the nullability is — the structure of the data. You don't know all the characterizations of data values because erwin is not profiling data values. It's just picking up the catalog structure of the tables. But it is completely trustworthy, once you've reverse-engineered it. It has never let me down along those lines.
What needs improvement?
One of the things I've been talking to the erwin team about through the years is that every data model should have the ability to be multi-language. So along with the fact that I can change, for example, the graphic of the model to look at just the definitions in boxes, or just the key structures in the boxes, I'd love to be able to change the language. When I was working at Honda, it became very difficult to work with the Japanese teams using just one model. You can have two models, one in English and one in Japanese, but that means you have to keep the updates back and forth, and that always increases the risk of something not being updated.
The world is getting to be a very small place, and being able to have one file that has all of that metadata in whatever form you need to read it, is the best way to manage that data. That would be a big change for them and it would be a big change to the Mart structure. It would be a one-to-many on the logical side of the business names, but it would also a one-to-many on the definition side of the tables and the columns and everything else, where you can have notes. I know that it's a big change I'm asking for, and they've had to put it off a little bit, but their business glossary tool now kind of looks at it that way. I'm hoping that the erwin model itself will be able to allow for that in the future.
For how long have I used the solution?
I've been using erwin Data Modeler from way before erwin owned it; since the '90s when it was Logic Works. That was before it went to Platinum and before it went to CA. And now they're spun off as erwin.
What do I think about the stability of the solution?
It's a very stable tool. It doesn't have problems with crashing or anything like that.
What do I think about the scalability of the solution?
I've never had a problem with its scalability, especially using the Workgroup Edition, because you keep all of your models in the database. It's not a problem to collect hundreds of different data models. Even scalability on your desktop or in your laptop would be more about the laptop itself, not the tool. It's kind of like Word. It saves the data outside of itself, so it doesn't have that problem.
There was no data modeling tool when I got here two years ago, so it is new to the culture, and this is a 40-year-old company. It is mostly being used with our master data management and our data warehousing, which is still doing a lot of development work. It's being expanded into supporting the data governance initiatives, to do data asset management. And I'm expecting that over time it will be used more for data asset change-control. We use a lot of vendor-purchased products, and being able to see the difference between their table structures before an upgrade and after an upgrade isn't being documented in a model right now, but it probably will be.
Also, the new California Consumer Privacy Act is forcing us to do much more of that data governance and data asset management, as well as data classification, so that we can identify PII data. That's definitely picking up steam.
How are customer service and technical support?
I use their technical support all the time. Sometimes it's just to ask them — because it's such a rich tool, they move menu items in the upgrades sometimes — "Okay, where did you put it this time?" But they've always been very helpful. They do have live chat on their website. About 75 percent of the time the chat agents can answer my question. If not, they hand me off to somebody. Given the amount of time I've worked with erwin, I almost know all their first names. They've always been very good and have taken care of me.
A lot of the technical staff moved with the tool, so they've stayed intact as it went through buyouts. I've always enjoyed working with the erwin team. They're very supportive, very helpful, and are very responsive to my requests and thoughts.
Which solution did I use previously and why did I switch?
My current company did not have a previous solution, other than Excel spreadsheets and Visio — nothing that I would call an industry-standard modeling tool.
How was the initial setup?
I was involved with the purchase and installation in my current company. I work with the DBAs so I don't touch the buttons for the installation. But the erwin support team is always a great help. I have never heard from any of the DBAs, during any of my "lifecycles," that installation is anything more than straightforward.
There's all sorts of bureaucracy that happens at a company, and that's true in our company as well. The deployment happened over the course of a couple of days: the installation, the tests, the verification, and making sure that the client-side could connect to the databases. I don't think any of that took too much time, other than getting everybody together to do it.
Our implementation strategy was to work with a very temporary dev environment and then roll it to a prod environment and then drop the dev environment. We don't keep a dev environment full-time because it is just a COTS tool. They do backups and restores just like any other mission-critical data. And we're using a combination of named licenses and concurrent licenses in our strategy so that we can leverage who uses it the most.
As for the number of people involved in an upgrade. I take on the SME role. We have the main DBA who is scheduling the upgrade into the environment. Then we generally have a DBA who is assigned to do the upgrade. And our service desk helps with the deployment of the client-side out to the users. So there are four people involved.
What was our ROI?
We saw return on our investment in erwin once we got our model library in place across all of our different data environments. Of course, you can always search using your DBA tools to find different things on a server. But once you've got your models in place, you can cross all the servers in your search, because you've pulled all that metadata into one place. It doesn't matter if it's an Oracle backend, an Access backend, a mission-critical Excel spreadsheet. Whatever it is that you have a model of, you can go search for something like a social security number. Just being able to do that, it almost pays for itself. When you think of how much time people spend to try to find things, it's completely amazing.
It depends on how many servers you have, how complex your environment is, and how many of your teams are going to look at stuff. If you have a really obfuscated structure, then you're actually profiling the data to figure things out.
Being able to type in, "Go find column names with SSN in them," it comes back almost immediately. That probably gets you 80 percent of the way to finding that particular aspect. How much time did we spend in the Y2K crisis just to find dates? Just identifying the columns that were going to be impacted was a feat. I keep telling my cohorts that social security number data is going to be the next Y2K. As soon as we run out of numbers, they're going to have to add a digit, but everything is hard-coded to the current span of digits. As soon as the federal government decides that it's going to do that, we are all going to have to go fix it.
The nice thing about having your assets in a database is that the more value-add you've done on your models, the less you have to look at physical names on columns. If you've put your logical or your business names on columns, that's even better.
I could imagine that in very serious research, you're going to cut 80 percent off the time it would take, depending on how complex your environment is. You can get there so much faster. Obviously, it won't give you everything because human beings just don't have it all written down. Or it could be that some nitwit is putting social security numbers into note fields and you don't know about it. But it's going to get you a long way there.
The erwin model is much more like an Access database. The return on investment is that it is a very three-dimensional type of metadata collection about your model. In some of Visio, you can add notes on a little graphic piece. But you can't add multiples. You could approximate multiples with carriage-returns in the block, but you can't categorize your metadata. You also can't add more value about that metadata. One little box on an erwin model can be opened logically and there will be 10 tabs worth of value-add you can put in. You can open the model so that you're looking at the physical side of the house, and still have another 10 tabs that have nothing to do with the logical side, other than that they share the primary key of the little graphic piece that you're looking at.
erwin is so much more flexible. And, with respect to return on investment, it's customizable. erwin has the concept of user-defined properties where if you need to do something special within your models that says something like, "Is this used by this line of business?" you can create flags, or dates, or text, or drop-down lists, and attach it to anything in the model itself. In that way you've created some value-add that is customized to your company's needs. To me that adds tremendous power to the return on investment. You can't do that with just plain drawing tools.
What's my experience with pricing, setup cost, and licensing?
We came up with a two-part concept with our licensing. Our data architects have named licenses that only they can use. We have four named licenses today. But we also bought three concurrent licenses, two that are just for developers and the DBAs, and one that's a "read-only" that anybody can use. It's a little bit difficult for me to tell you how many people use those, but probably no less than 10 and possibly upwards of 25.
We pay for maintenance on a yearly basis. There are no additional costs for the Workgroup Edition, which has the server component. That is the edition where you can save your models back to a database, which we installed on SQL Server, but I think you can install it on any of several different platforms.
Which other solutions did I evaluate?
Our company looked at two others. Because I have worked with erwin for so long, I wanted to make sure, when I came in, that my current company got the opportunity to make its choice based on what everybody's needs were here. We did a full vendor tool assessment back then. Although I don't have it in front of me, I know we looked at Embarcadero and it may be that we also did the highest level of Visio, so that between them we looked at a very high-grade tool and something that would just get us by.
When I got here, the DBAs had already put acquiring an erwin license into their next year's budget. They had already made that choice. But I took us all the way back to doing a tool compare because I wanted to make sure that everybody got the opportunity to weigh in on the choice that was made.
A lot of the difference between erwin and other products was the licensing and pricing structure for maintenance. Some of it was the inter-connectability with other tools. erwin does a really good job of building bridges between many different tools. Part of it was also its ability to be very sustainable because it had the Workgroup database backend, which Embarcadero has as well, but Visio does not. That was part of the decision point: whether we wanted to go with something really small and move up to a more industry-standard tool, or just take the opportunity to bring in a couple of licenses. We brought in a smaller footprint last year, and we added a few more licenses in 2019.
The primary reasons that erwin was selected were that it was much more affordable for us and it was easily maintainable.
What other advice do I have?
Take the time, especially if you're going to use Workgroup, but even if you're using desktops, to figure out how you're going to manage the models. They need to have a naming convention. They need to have a directory organization that makes sense to you. They need to have change-control, just like code. You need to figure out how you're going to use it because once it gets past 50 models, finding something and knowing how to change it and where to change it and where to publish it back out is going to be your biggest headache. You need to think long-term. It's easy when you just have a few models. As soon as you have 1,000 of them, unless you've thought ahead, you're going to have a huge cleanup problem.
The biggest lesson I take away from using erwin Data Modeler is that we should all be doing much better library sciences with our data assets than we do. erwin is a great tool to capture your library sciences. It can tell you what you need to know about a piece of data, or a row of data as a dataset in a table, or a collection of tables. You can add information not just about single things but collections of things.
We should have many more people whose job it is to add that value. Right now, companies still mostly use erwin for custom development and it needs to be much more built into documentation of any type of data. I use erwin to do data models of reports and of API calls, for example. Any data set, to me, qualifies as needing a model so that you can tell what data elements are in it and what that dataset is used for.
Through all the years, erwin has done a great job of making things better and better. There are always things that we're talking about in terms of improving it, but the fact that it's now starting to integrate better with data governance-type tools so that all of your definitions can move to more of a glossary form, rather than just being in the models, is tremendous. The more that that's integrated back and forth, the better it's going to be.
Out of all of the modeling tools, erwin is a 10 out of 10. It hits all the high points for me. There are some pieces of functionality that competitors come up with, maybe a little bit earlier, but it's a leapfrog-type of thing. Every time the vendors find that something is needed in the world of modelers, they all start to bring it in. I find erwin to be very responsive to those needs. So now, erwin has NoSQL modeling aspects in the tool and they're connecting with their own suite of data governance tools. That means you can push definitions to your data governance tool or bring them back from your data governance tool. It's starting to become much more of an integrated solution, rather than just a standalone.
Which deployment model are you using for this solution?