The fitness function is the function you want to optimize for. Typically you would use some validation scheme like cross validation inside of the genetic algorithm, and the performance estimation delivered by the cross validation becomes the fitness function. This means that if you calculate, for example, the "accuracy" with the cross validation, this is what should be maximized. If it is root mean squared error, this should be minimized. You don't need to worry about this though, since RapidMiner will automatically choose the correct direction (minimization or maximization). If you double click on the selection operator, you will see the inside where you can do the performance calculation. You can also do this in a different way as long as you deliver a performance vector to the inner output port.

Here is the process from the "Samples" directory showing you how this looks in detail:

This process is very similar to the previous process. Again, a cross validation building block is used as fitness evaluation. In this case, a genetic algorithm for feature selection is used. This process demonstrates two other features of the feature operators of RapidMiner: the stop button, which lets the user abort the process once satisfied with the result, and the ProcessLog operator, which allows online plotting of current fitness values. Please refer to the visualisation sample processes and the RapidMiner operator manual for further details.

Well, this is a bigger project and nobody here will probably be able to describe the complete solution in a single post :-)

But here are some hints to get you started:

For all types of text analysis (including sentiment analysis and others), you will need an extension for RapidMiner. You can download these extensions for free from our Marketplace. You can find it in the menu "Extensions", then "Marketplace", and type "Text", "Aylien", or "Rosette" in the search box. There are also many more extensions on our Marketplace, so make sure that you check them out…

There is a community member who created a nice set of tutorials for text analysis using the Text Mining extension with RapidMiner:

And our friends at Aylien have a great series of blog posts explaining the Aylien extension:

With the Web Mining extension you will also get the necessary operators to access data from online data sources automatically.

So the answer is: YES, RapidMiner can do the job. But it will require some time and effort from you to define the right processes for your use case.

Although way (1) would be possible, this is too cumbersome in my opinion for short or single values. Especially if you want to export your processes as web services via RapidMiner Server so that you can invoke them through a URL. So I would only recommend way (1) if you actually need to transfer larger inputs. I will take the time to explain this in a bit more detail here, since I don't think some of the hints I will give are documented elsewhere.

Look into the menu "View" and then "Show Panel" and activate the "Context" panel in case you did not do so already. I personally prefer to have this together with the Process and the XML tab in the center of my Design view, but you can put it wherever you want. The result should look something like this:

Build the process you want to control with the value passed from the outer process (or as web service parameter). In my example, I simply use a Generate Data operator and use the macro % for the parameter defining the size of the example set. If you do this (at least with the latest versions of RapidMiner), it will complain that the macro is not defined. Defining it is what we do in the Context panel now.

In the Context panel, you will find a section in the bottom third called "Macros". Those are the macros which are set for the whole process but can also be changed from the outside (which we will see in the next step). Click on the plus icon on the right side, which will add a new line. Use "number_of_examples" as macro name in the left cell and, for example, "500" as value in the right cell. If you execute the process now, it will generate a data set with 5 columns (attributes) and 500 rows (examples). You can change the value in the process' Context and re-run the process to see how this changes the result. Make sure to store it in your repository under an arbitrary name.

For more information on macros I would recommend this video here from our friends at RapidMiner Resources:

You can now create the process which invokes the first process above and passes a value to the macro you have defined in the Context.
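To make the macro idea from the Context panel walkthrough more concrete, here is a small Python sketch of the same mechanism. It is not RapidMiner code and does not use any RapidMiner API. The function names `resolve_macros` and `run_inner_process` are made up for illustration, and the `%{name}` placeholder form is an assumption, since the text above only shows `%`. The sketch shows a default macro value set in the "context", an override passed in from an outer caller, and an error when a macro is used without being defined.

```python
import random
import re

# Assumed placeholder form: %{macro_name} (hypothetical for this sketch).
MACRO_PATTERN = re.compile(r"%\{(\w+)\}")


def resolve_macros(text, macros):
    """Replace every %{name} placeholder in text with its value from macros."""
    def lookup(match):
        name = match.group(1)
        if name not in macros:
            # Mirrors the complaint about an undefined macro.
            raise KeyError(f"macro '{name}' is not defined")
        return str(macros[name])
    return MACRO_PATTERN.sub(lookup, text)


def run_inner_process(context_macros, overrides=None):
    """Stand-in for the inner process: generate rows of 5 random attributes.

    context_macros plays the role of the defaults set in the Context panel;
    overrides plays the role of values passed in by the outer process.
    """
    macros = {**context_macros, **(overrides or {})}
    # The parameter is stored as text containing the placeholder.
    size = int(resolve_macros("%{number_of_examples}", macros))
    rng = random.Random(0)  # fixed seed so repeated runs match
    return [[rng.random() for _ in range(5)] for _ in range(size)]


context = {"number_of_examples": "500"}

# Run with the default from the context.
default_data = run_inner_process(context)

# Run again as an outer caller would, overriding the macro value.
overridden_data = run_inner_process(context, {"number_of_examples": "20"})

print(len(default_data), len(default_data[0]))
print(len(overridden_data), len(overridden_data[0]))
```

The override dictionary is merged after the context defaults, so a value passed from outside wins over the one stored with the process. This is the behaviour the walkthrough relies on when an outer process or a web service parameter sets the macro.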